I've been using ab (ApacheBench - comes with Apache httpd) lately to do some performance benchmarking of our internal web services at work. It is nice & simple to use, but unfortunately it is limited to only requesting the same URL over and over. For some services, such as a search engine that normally receives different query parameters with every request, this does not really represent reality.
I have created a patch for ab that gives it a new option (-R). This lets you specify a file, and ab will append one line from that file to the base URL for each request, in the order the lines appear in the file. If ab reaches the end of the file before the test is finished, it returns to the first line and repeats them all.
An example explains this better.
Out of the box you may use ab to benchmark the speed of your site's search:
$ ab -n 5000 http://www.something/search?q=ipod
This will cause ab to send 5000 requests to the specified URL. Handy, but it is testing the same query over & over, which is not what the site would see in practice.
Instead, you could use the -R patch, by first creating a file (let's call it requests.txt) containing something like:
ipod
apple+iphone
apple+ipod
dvd+player
and running ab with:
$ ab -n 5000 -R requests.txt http://www.something/search?q=
As ab constructs each request, it fetches the next line from requests.txt and appends it to the base URL; the result is the URL used for that request. In this example it would query the URLs:
http://www.something/search?q=ipod
http://www.something/search?q=apple+iphone
http://www.something/search?q=apple+ipod
http://www.something/search?q=dvd+player
http://www.something/search?q=ipod
http://www.something/search?q=apple+iphone
http://www.something/search?q=apple+ipod
and so on.
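To make the wrap-around behaviour concrete, here is a small Python sketch of how the sequence of URLs is built. It is only an illustration of the idea, not the patch itself (which is C code inside ab); the function name request_urls is made up for this example.

```python
from itertools import cycle, islice

def request_urls(base_url, lines, count):
    """Return the first `count` URLs: base_url plus each line, wrapping around.

    Hypothetical helper for illustration; it mimics the behaviour described
    for the -R option, not the patch's actual implementation.
    """
    # Strip the trailing newline from each line of the request file.
    suffixes = [line.rstrip("\n") for line in lines]
    # cycle() restarts from the first line once the list is exhausted.
    return [base_url + suffix for suffix in islice(cycle(suffixes), count)]

# The four lines of requests.txt from the example above.
lines = ["ipod\n", "apple+iphone\n", "apple+ipod\n", "dvd+player\n"]
urls = request_urls("http://www.something/search?q=", lines, 6)
for url in urls:
    print(url)
```

With four lines in the file and six requests, the fifth and sixth URLs reuse the first and second lines.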
This is much more useful, at least for the types of benchmarks I want to do.
You can find the ab patch here.
At work we run a bunch of web applications (mostly TurboGears, CherryPy & Twisted apps) and host them behind Apache, using mod_proxy (and sometimes mod_rewrite) to present a clean URL to the outside world while each app runs on its own private port behind the scenes. Different people manage different web apps.
In front of our web farms we use hardware load balancers to handle request arbitration, which provides nice protection from servers or Apache instances going down.
The biggest problem I've had with this configuration until now is that when we need to perform maintenance on a particular web application, bringing that application down causes Apache to return an unhelpful message like "Service unavailable" to the client, as its attempt to reverse proxy the connection to the internal service fails.
For a long while I've wanted mod_proxy to be smarter, so that I could tell it "hey, if the normal service you are forwarding to is not available, forward to this one instead". And "this one" would simply be the same service running on a different peer server.
Well, that is exactly what mod_proxy_balancer in Apache 2.2 allows you to do. It goes beyond that and can provide weighted load balancing of internal services, but it also allows you to define "hot spares" which are only used if the normal service(s) are unavailable. This is what I'm using, with a config like:
# Reverse Proxy /myapp to an internal web service, with fail-over to a hot standby
<Proxy balancer://myappcluster>
    BalancerMember http://127.0.0.1:7825
    # the hot standby on server2
    BalancerMember http://10.0.0.2:7825 status=+H
</Proxy>
<Location /myapp>
    ProxyPass balancer://myappcluster
    ProxyPassReverse http://127.0.0.1:7825
    ProxyPassReverse http://10.0.0.2:7825
</Location>
This config tells Apache to proxy requests for /myapp to a web service on localhost at http://127.0.0.1:7825.
If that service becomes unavailable (i.e. you take it down for maintenance), Apache automatically sends requests to http://10.0.0.2:7825 instead. The "status=+H" flag marks that member as a hot standby. When the default service is back online, mod_proxy_balancer picks that up within about 60 seconds (the default retry interval for a failed member) and reverts to forwarding all requests to it.
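The fail-over rule can be sketched in a few lines of Python. This is only a model of the behaviour described above, not how mod_proxy_balancer is implemented; pick_backend and is_up are names invented for the example, and the retry timing is left out entirely.

```python
def pick_backend(members, is_up):
    """Choose a backend URL: normal members first, hot standbys as a last resort.

    members: list of (url, is_hot_standby) tuples (hypothetical structure).
    is_up:   callable taking a URL and returning True if it is reachable.
    """
    normal = [url for url, standby in members if not standby and is_up(url)]
    if normal:
        return normal[0]
    # Only consider hot standbys when no normal member is usable.
    spares = [url for url, standby in members if standby and is_up(url)]
    return spares[0] if spares else None

# The two members from the config above.
members = [("http://127.0.0.1:7825", False), ("http://10.0.0.2:7825", True)]
down = set()

def is_up(url):
    return url not in down

primary_choice = pick_backend(members, is_up)   # local service is up
down.add("http://127.0.0.1:7825")               # take it down for maintenance
failover_choice = pick_backend(members, is_up)  # standby takes over
down.clear()                                    # service is back
restored_choice = pick_backend(members, is_up)
```

The standby is never chosen while the normal member is reachable, which is the property that makes it a spare rather than a second load-balanced member.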
The ProxyPassReverse directives are unrelated to the proxy balancing smarts, but you usually need them so that redirects issued by the backend (Location headers and the like) are rewritten to the public URL.
You can also get real load balancing if you define some BalancerMember entries that aren't hot standbys. mod_proxy_balancer will balance requests across them, and hot standby members won't be used until all normal members become unavailable. You can control the weighting of members and the balancing method too, if you like. See the ProxyPass and mod_proxy_balancer docs.
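As a rough picture of what weighted balancing with hot standbys means, here is a simplified Python model in the spirit of a request-counting scheme. It is an assumption-laden sketch, not Apache's code: the Member class, the choose function and the field names are all invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Member:
    url: str
    loadfactor: int = 1        # relative share of requests (illustrative)
    hot_standby: bool = False
    available: bool = True
    lbstatus: int = 0          # running score used to pick the next member

def choose(members):
    """Pick the next member by weight; use standbys only if no normal member is up."""
    normal = [m for m in members if m.available and not m.hot_standby]
    pool = normal or [m for m in members if m.available and m.hot_standby]
    if not pool:
        return None
    total = sum(m.loadfactor for m in pool)
    # Every candidate earns its weight; the highest score wins the request
    # and then pays back the total, so shares even out over time.
    for m in pool:
        m.lbstatus += m.loadfactor
    best = max(pool, key=lambda m: m.lbstatus)
    best.lbstatus -= total
    return best

a = Member("http://10.0.0.1:7825", loadfactor=2)
b = Member("http://10.0.0.3:7825", loadfactor=1)
spare = Member("http://10.0.0.2:7825", hot_standby=True)
cluster = [a, b, spare]

picks = [choose(cluster).url for _ in range(6)]
```

Over six requests the member with weight 2 receives twice as many requests as the member with weight 1, and the spare receives none while either of them is available.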
