Why A Web Server? The Benefits of Apache

Steve Ayers

A few years ago, I was out with friends at a small club to watch a live band perform. As the night wore on, the club filled and filled to almost uncomfortable levels. Because of this, my friends and I became squeezed closer and closer to the door until our little group was standing right next to the entrance. That’s when something out of the ordinary began to happen.

A girl walked into the bar and as she filed through the entrance, she handed me her ticket to get into the club. Incredulous, I sheepishly told her that I wasn’t the bouncer and that she was free to enter, at least as far as I was concerned.

However, the people waiting in line behind her didn’t hear this explanation. They simply saw me take her ticket, say something, hand the ticket back, and then watched her walk right in. From then on, everyone began to hand me their tickets. Seeing the entertainment this was providing my infantile buddies, I began to simply accept people’s tickets and wave them in. After about 10-15 minutes of this, however, a friend of mine whispered in my ear: ‘The bouncers saw you and they’re coming for us.’ Needless to say, I never saw the concert that night.

Now, you may be asking yourself, ‘What does this have to do with technology?’ Well, nothing really. But what my pointless anecdote IS about is metaphor. It is about the importance of protecting the entrance to your club. It is about choosing the right person for the job. It is about having someone who actually knows what they’re doing let traffic through your front doors.

Web servers are that bouncer.

The difference between a web server and an application server can be a confusing thing. In fact, Googling ‘difference between web server and application server’ turned up about 341,000 results. So, if you too have been wondering what the difference was, fear not, you’re not alone.

The basic definition that you’ll read online is that 'the primary function of a web server is to deliver web pages on the request to clients'. That’s all well and good, but how is that different than any application server you use? How is that different than Tomcat? Well I'm glad you asked.

Think of an application server as an ecosystem of many different pieces and parts all used for the functioning of an application: an EJB component, a JMS component, a web container, etc. In addition, the application server provides an API to the developer for interaction with each piece and part. The web server on the other hand can either be the bouncer that opens the door to these pieces and parts or a component itself.

Take Tomcat, for instance. Many mistake Tomcat for an application server, but it is actually a servlet container that delivers dynamic content in the form of JSP pages. To be more specific, Catalina is the servlet container, and Coyote is the web-server-esque connector that forwards requests on to Catalina. Web pages are then delivered dynamically through Jasper, the JSP engine. So, in summary, Tomcat is the sum of Catalina, Coyote, and Jasper.

Essentially, Tomcat itself can serve as a web server. Its Coyote component handles the requests, forwards them onto Catalina, which then allows Jasper to serve up a JSP. Voila: 'delivering web pages on requests to clients'. So, why shouldn't you use Tomcat?

AHA! I knew you'd ask.

Because while Tomcat may be OK with handling your requests and delivering up the pages, a proper front-facing web server such as Apache can do so much more for you. It allows you to apply a myriad of different features to the functioning of your application. Remember, just because I was able to check ticket stubs doesn't mean I was the best man for the job.

Ask Not What You Can Do For Apache

To illustrate what a web server can provide, I am going to describe a few of the features Apache Web Server offers. Apache is an open-source web server, and it has been the most popular one on the internet for the last 15 years. My hope is that through an understanding of Apache the line between application server, web server, and a skinny impersonation of a bouncer becomes clearer.

So, what can Apache do for you?

mod_jk and the AJP Connector

A great deal of Apache's functionality is delivered in the form of modules, which are basically plugins that extend and enhance the core Apache functionality. These modules can be downloaded individually (or built from Makefiles) and installed into an existing installation, or baked in at installation time (which is what I recommend).

Even though Tomcat can function as a standalone web server, Apache can function as a much better one. In an effort to keep it all in the family, though, Apache has a module, mod_jk, which allows smooth integration with a Tomcat container. Basically, it allows Apache to accept requests and then forward them on to the appropriate Tomcat instance, roughly mimicking the job of Coyote described above. The main advantage of using Apache, however, is that it allows you to run multiple Tomcat instances behind a single instance of Apache.

The basic steps to configure this are:

1. Define an AJP connector port on each Tomcat instance which will accept requests from Apache
2. Define a worker which ensures requests through Apache are routed to the proper Tomcat instance
3. Configure the Apache configuration file to invoke the correct workers at the correct time.

Let's focus on the last two since they are Apache-specific. To define a worker, you create a properties file called workers.properties and inside, define your worker using basic key-value pairs:

worker.list=myworker

worker.myworker.type=ajp13
worker.myworker.host=localhost
worker.myworker.port=8009

In other words: define a worker named myworker, which uses the AJP 1.3 protocol and forwards requests to port 8009, which is where my Tomcat instance is listening for requests from Apache.

Then, in your Apache configuration file, you hook in the worker:

JkWorkersFile /path/to/workers.properties
JkMount /* myworker

Which essentially says, 'mount all requests to any path (/*) to the worker myworker defined in workers.properties'.

The result is a perfectly integrated Apache and Tomcat instance. Also, remember multiple instances can be defined.

Keep in mind that the connection between Apache and Tomcat over AJP is not encrypted, so even if you are operating over HTTPS from the browser to Apache, the subsequent transmission from Apache to Tomcat is insecure. This is not necessarily a huge problem, since that transmission most likely stays within your internal network, but it is nevertheless something to be aware of.

VirtualHosts

Another helpful feature of Apache is that it allows you to create virtual hosts on your server so that you can give the appearance of many different hosts all operating on the same IP address. Let's take a real-world example to illustrate the point.

Let's assume your latest pet project is a site that allows your users to enter any word at all and return the relevant matches in the Beatles canon. Type in 'France' and you're returned information about the Beatles performing in France for the first time as well as the lyrics to 'Michelle'. Type in 'Julia' and you're returned a clip of the song Julia as well as information on John Lennon's mother. All the code containing your precious algorithms is going to be hosted by a third-party hosting provider, which has, say, the IP address: 192.193.4.5. You also own the domain name, beatlesearch.com, which you bought after a brainstorm in the shower three years ago.

In addition, you want to allow your users to create JIRA tickets as they notice problems and bugs on your site. The URL for this will be issues.beatlesearch.com.

With Apache, you can create VirtualHosts on your main server that accept requests addressed to designated host names and route them to the correct destinations. These VirtualHosts are defined in the Apache configuration file, which is where most of Apache's configuration is done. Here is an example of how these VirtualHost definitions would look in the configuration file:

NameVirtualHost *:80

<VirtualHost *:80>
ServerName www.beatlesearch.com
JkMount /* myworker
</VirtualHost>

<VirtualHost *:80>
ServerName issues.beatlesearch.com
JkMount /* myjiraworker
</VirtualHost>


Note that what we've done here is combine the use of mod_jk with our VirtualHosts. Since JIRA comes prepackaged with its own Tomcat instance, we will be forwarding requests to multiple servlet containers. As long as each of the above ServerName values is DNS-mapped to our hosted server's IP address, these VirtualHosts will route requests to the appropriate container.
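
For the two JkMount lines above to resolve, both workers have to exist in workers.properties. Here is a minimal sketch, assuming JIRA's Tomcat listens for AJP on port 8010 on the same machine (that port is my assumption for illustration; use whatever its AJP connector is actually set to):

```properties
# workers.properties: one worker per Tomcat instance
worker.list=myworker,myjiraworker

# Main beatlesearch application
worker.myworker.type=ajp13
worker.myworker.host=localhost
worker.myworker.port=8009

# JIRA's bundled Tomcat (port 8010 is an assumed value)
worker.myjiraworker.type=ajp13
worker.myjiraworker.host=localhost
worker.myjiraworker.port=8010
```

Each Tomcat needs its own AJP port when both run on the same host, which is why the two workers differ only in their port value.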

URL Rewriting (mod_rewrite)

One interesting feature of Apache is the ability to rewrite URLs based on regex patterns, which lets you control where a request actually ends up. One example, which can be extremely useful when securing your application, is the ability to redirect specific URLs to HTTPS based on their relative path.

For example, suppose you would like the login pages of your beatlesearch brainchild (everything under /login/) to be served through HTTPS, but you want the rest of the site to use plain old HTTP. With mod_rewrite, you would add the following to your configuration file:

RewriteEngine on
RewriteCond %{REQUEST_URI} ^/login/.*
RewriteRule ^.*$ https://%{SERVER_NAME}%{REQUEST_URI} [L]

which in plain English means the following:

Turn the rewrite engine on
If the request URI matches the pattern /login/ Then
Rewrite the URL to use HTTPS using this exact server name and request URI.

REQUEST_URI and SERVER_NAME are predefined variables in mod_rewrite land which allow you to act on values at runtime. The [L] in brackets at the end of the third line means 'this is the last rule'. In addition, you can string multiple rewrite conditions together. Conditions are joined by an implicit 'AND'; to join a condition to the next one with 'OR' instead, append [OR] to the end of that condition.
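
To see how the implicit AND and the explicit [OR] combine, here is a sketch that extends the rule to a second path. The /account/ path is hypothetical, and the %{HTTPS} check and the explicit [R] flag are my additions so that requests already on HTTPS are left alone:

```apache
RewriteEngine on
# Only act on requests that arrived over plain HTTP
RewriteCond %{HTTPS} off
# ...AND whose path starts with /login/ OR /account/ (hypothetical path)
RewriteCond %{REQUEST_URI} ^/login/ [OR]
RewriteCond %{REQUEST_URI} ^/account/
# Redirect to the same host and URI over HTTPS, then stop processing rules
RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [R,L]
```

Read top to bottom, the conditions amount to 'not already HTTPS AND (path starts with /login/ OR path starts with /account/)'.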

All in all, it's a very easy way to enforce HTTPS only on certain relative paths. In my opinion, this is much easier than, say, the use of security-constraints in your deployment descriptor.

Load Balancing

Perhaps the single greatest use of a web server is the ability to load balance traffic in a cluster. Apache makes this easy through the use of two modules, mod_proxy and mod_proxy_balancer. Load balancing allows Apache to act as your bouncer, dividing traffic evenly among all members of your cluster. You have your choice of three different algorithms for configuring how loads are balanced:

1. Request Counting

This allows you to configure each member of your cluster to receive its fair share of work, based on the relative number of requests that member should handle. So, a cluster configured as:

                    Server1  Server2  Server3  Server4
Work Effort Factor  20       40       20       20

would result in Server2 handling twice as many requests as any other member of the cluster. Note that these values are relative, so the above is the same as:

                    Server1  Server2  Server3  Server4
Work Effort Factor  1        2        1        1
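
In mod_proxy_balancer terms, the Work Effort Factor corresponds to the loadfactor parameter on each BalancerMember. Here is a minimal sketch of the 1/2/1/1 cluster, assuming placeholder host names, port 8080, and a balancer name (beatlecluster) that I made up, and assuming mod_proxy, mod_proxy_http, and mod_proxy_balancer are loaded (Apache 2.4 also needs the matching mod_lbmethod module):

```apache
<Proxy "balancer://beatlecluster">
    # Host names and ports are placeholders; loadfactor carries the relative weight
    BalancerMember "http://server1.example.com:8080" loadfactor=1
    BalancerMember "http://server2.example.com:8080" loadfactor=2
    BalancerMember "http://server3.example.com:8080" loadfactor=1
    BalancerMember "http://server4.example.com:8080" loadfactor=1
    # Request Counting
    ProxySet lbmethod=byrequests
</Proxy>

# Send all incoming requests through the balancer
ProxyPass        "/" "balancer://beatlecluster/"
ProxyPassReverse "/" "balancer://beatlecluster/"
```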

2. Weighted Traffic

Weighted Traffic balancing works basically the same as Request Counting, except you specify a factor representing the relative SIZE of the traffic, in bytes, that each node will handle. For example:

                        Server1  Server2  Server3  Server4
Size of Traffic Factor  20       40       20       20

This means that we want Server2 to process twice as many bytes of traffic as the other 3 nodes. Remember that this does not necessarily mean more requests, just that Server2 will handle twice as much I/O as the other nodes. Again, values are relative as in Request Counting.
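
As a rough sketch, the same factors can be expressed in a mod_proxy_balancer block by setting the method to bytraffic. The host names, ports, and balancer name are placeholders of mine, and the ProxyPass lines that map URLs onto the balancer are left out:

```apache
<Proxy "balancer://beatlecluster">
    # Placeholder hosts; here loadfactor weights bytes transferred, not request counts
    BalancerMember "http://server1.example.com:8080" loadfactor=20
    BalancerMember "http://server2.example.com:8080" loadfactor=40
    BalancerMember "http://server3.example.com:8080" loadfactor=20
    BalancerMember "http://server4.example.com:8080" loadfactor=20
    # Weighted Traffic
    ProxySet lbmethod=bytraffic
</Proxy>
```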

3. Pending Request

Pending Request basically works according to who is busiest. Apache will route each request to the node which has the fewest active requests. This becomes especially useful with nodes that queue requests, since Apache’s load balancing algorithm will work to keep those queues even.
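
In configuration terms, this method is selected with bybusyness. Here is a minimal sketch, again with placeholder host names and a made-up balancer name, and without the ProxyPass lines that point traffic at the balancer:

```apache
<Proxy "balancer://beatlecluster">
    # Placeholder hosts; no weights are set, so members are treated equally
    BalancerMember "http://server1.example.com:8080"
    BalancerMember "http://server2.example.com:8080"
    BalancerMember "http://server3.example.com:8080"
    BalancerMember "http://server4.example.com:8080"
    # Pending Request: prefer the member with the fewest active requests
    ProxySet lbmethod=bybusyness
</Proxy>
```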

So, there you have it: a quick explanation of what Apache does and, further, the real purpose behind a web server. Hopefully, this provided some clarification on how web servers are used in a real-world application and why they can be extremely beneficial in a high-volume environment. Web servers can provide a ton of functionality to all traffic coming through your doors. Just don’t let the other bouncers see it happening.

About the Author

Steve Ayers, Summa Alumni