Poor performance of pooled connections worse in 4.0

Hi there. I'm new to the group. Just upgraded from 3.1 to 4.0 for a high-traffic production server cluster and noticed a drop in performance. Requests are consistently taking ~40% longer. Disabling *http.connection.stalecheck* had little impact.

While investigating the issue, I noticed that switching from a shared HttpClient with a ThreadSafeClientConnManager to a new simple HttpClient per request cuts down minimum and average request times dramatically (over 80%).

It seems the overhead for pooling and reusing connections dwarfs the overhead of establishing HTTP connections. Is this just me? Anyone else seen this?

Jared

P.S. Here's some of my raw benchmarking data. These numbers are for simple GETs to http://www.google.com. The results are nearly identical for our production situation (talking to a specific low-latency, non-Google web service). My benchmark just makes 50 requests to the same URL either serially or in parallel. The timed code block is simply this:

    client.execute(newHttpRequest()).getEntity().consumeContent();

HttpClient 4.0

*ThreadSafeClientConnManager*
N=50, avg=305.8ms, min=218, max=444
N=50, avg=323.5ms, min=221, max=564
N=50, avg=519.9ms, min=223, max=1102
N=50, avg=410.2ms, min=197, max=693
N=50, avg=313.0ms, min=204, max=449

*SingleClientConnManager*
N=50, avg=36.1ms, min=20, max=474
N=50, avg=39.0ms, min=27, max=395
N=50, avg=37.9ms, min=28, max=368

HttpClient 3.1 (for comparison)

*MultiThreadedHttpConnectionManager*
N=50, avg=221.7, min=122, max=350
N=50, avg=215.3, min=133, max=303
N=50, avg=205.1, min=132, max=284
N=50, avg=170.6, min=105, max=250
N=50, avg=276.3, min=102, max=525

*SimpleHttpConnectionManager*
N=50, avg=37.9, min=29, max=173
N=50, avg=29.8, min=19, max=198
N=50, avg=26.1, min=18, max=143
N=50, avg=27.7, min=18, max=147
N=50, avg=29.7, min=20, max=189
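For readers who want to produce figures in the same "N, avg, min, max" shape, here is a minimal sketch of a serial timing harness. It is not Jared's benchmark (his source was sent off-list and is not part of this thread). The class name `LatencyBenchmark`, the methods `timeSerially` and `summarize`, and the sleep-based stand-in workload are all assumptions made for illustration; a real run would put the `client.execute(...)` call inside the timed task.

```java
import java.util.Locale;
import java.util.concurrent.Callable;

public class LatencyBenchmark {

    /** Runs the task n times, one after another, and returns each run's elapsed time in ms. */
    public static long[] timeSerially(Callable<?> task, int n) throws Exception {
        long[] elapsedMs = new long[n];
        for (int i = 0; i < n; i++) {
            long start = System.nanoTime();
            task.call();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return elapsedMs;
    }

    /** Formats a non-empty array of samples in the same shape as the figures above. */
    public static String summarize(long[] samples) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        long sum = 0;
        for (long s : samples) {
            min = Math.min(min, s);
            max = Math.max(max, s);
            sum += s;
        }
        double avg = (double) sum / samples.length;
        return String.format(Locale.ROOT, "N=%d, avg=%.1fms, min=%d, max=%d",
                samples.length, avg, min, max);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload (an assumption): sleep instead of issuing a real HTTP GET.
        long[] samples = timeSerially(() -> { Thread.sleep(5); return null; }, 50);
        System.out.println(summarize(samples));
    }
}
```

Note that a harness like this reports only the spread of individual request times; it says nothing about why a given request was slow.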

Re: Poor performance of pooled connections worse in 4.0

Jared

Have you increased the max limit on connections per host, which is set to 2 per default? Most likely your 50 worker threads spend most of their time blocked waiting for one of those two connections to become available.

You can see what exactly is happening with the connection pool using the following logging config:

    -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.SimpleLog
    -Dorg.apache.commons.logging.simplelog.showdatetime=true
    -Dorg.apache.commons.logging.simplelog.log.org.apache.http.impl.conn=DEBUG

Hope this helps

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@...
For additional commands, e-mail: httpclient-users-help@...
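Oleg's point about the per-host limit can be illustrated without HttpClient at all. The sketch below is a stand-in built on assumptions: a `java.util.concurrent.Semaphore` plays the role of the connection pool, each permit is one "connection", and `Thread.sleep` stands in for the request. The names `PoolLimitDemo` and `maxConcurrent` are hypothetical. It shows only that, however many workers are started, no more than the configured limit ever hold a connection at once; the rest are blocked in `acquire()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolLimitDemo {

    /** Returns the highest number of workers that held a "connection" at the same time. */
    public static int maxConcurrent(int workers, int maxPerRoute, long holdMs)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxPerRoute); // stand-in for the pool
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService exec = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            exec.execute(() -> {
                try {
                    permits.acquire(); // blocks while all "connections" are leased
                    try {
                        int now = inUse.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(holdMs); // stand-in for the HTTP exchange
                    } finally {
                        inUse.decrementAndGet();
                        permits.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        exec.shutdown();
        exec.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 50 workers competing for 2 permits, mirroring the default limit Oleg mentions.
        System.out.println("peak concurrent leases: " + maxConcurrent(50, 2, 20));
    }
}
```

With 50 workers and 2 permits, the time a worker spends waiting for a permit is added to whatever the benchmark records as that request's elapsed time, which is the effect Oleg is describing.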

Re: Poor performance of pooled connections worse in 4.0

Thanks for the response, Oleg.

On Thu, Nov 5, 2009 at 1:18 PM, Oleg Kalnichevski <olegk@...> wrote:
> Have you increased the max limit on connections per host, which is set to 2
> per default?

Yes, I did increase the limit. Here's how I initialized the HttpClient:

    private static HttpClient newMultiThreadedHttpClient() {
      return new DefaultHttpClient(
          new ThreadSafeClientConnManager(
              new BasicHttpParams()
                  .setParameter(STALE_CONNECTION_CHECK, false)
                  .setParameter(MAX_TOTAL_CONNECTIONS, 10)
                  .setParameter(MAX_CONNECTIONS_PER_ROUTE, new ConnPerRoute() {
                    public int getMaxForRoute(HttpRoute route) {
                      return 10;
                    }}),
              createSchemeRegistry()),
          null);
    }

Regardless of what the max connections per host limit is set to, though, at least the first request should not block at all. Notice that the *minimum* elapsed times of the N=50 requests done in each of my trial runs are all very high when using pooled connections. This means that even the first request is consistently slow.

I'd be happy to send you my benchmark source code. It's a single file, 100 lines.

For now, we're content using a disposable HttpClient per request. I hope to have time to profile and investigate the connection pooling issue further soon.

My main reason for posting to this list was the hope that someone would be able to contradict me, ideally with measurements of their own.

Regards,
Jared

Re: Poor performance of pooled connections worse in 4.0

Jared Jacobs wrote:
> Yes, I did increase the limit. Here's how I initialized the HttpClient:
>
> private static HttpClient newMultiThreadedHttpClient() {
>   return new DefaultHttpClient(
>       new ThreadSafeClientConnManager(
>           new BasicHttpParams()
>               .setParameter(STALE_CONNECTION_CHECK, false)
>               .setParameter(MAX_TOTAL_CONNECTIONS, 10)
>               .setParameter(MAX_CONNECTIONS_PER_ROUTE, new ConnPerRoute() {
>                 public int getMaxForRoute(HttpRoute route) {
>                   return 10;
>                 }}),
>           createSchemeRegistry()),
>       null);
> }

What is the point of having a 10-connection limit and using 50 worker threads?

> Regardless of what the max connections per host limit is set to, though, at
> least the first request should not block at all.

Why?

> Notice that the *minimum* elapsed times of the N=50 requests done in each of
> my trial runs are all very high when using pooled connections. This means
> that even the first request is consistently slow.

All these numbers are meaningless given such a small number of requests. You should be executing 10,000 HTTP requests in order to get any meaningful performance data.

> I'd be happy to send you my benchmark source code. It's a single file, 100
> lines.

Send the log of the session with connection pooling.

> For now, we're content using a disposable HttpClient per request. I hope to
> have time to profile and investigate the connection pooling issue further
> soon.
>
> My main reason for posting to this list was the hope that someone would be
> able to contradict me, ideally with measurements of their own.

I suspect your measurements are flawed, mainly due to the unrepresentative number of requests they are based upon.

Oleg

Re: Poor performance of pooled connections worse in 4.0

> What is the point of having a 10-connection limit and using 50 worker
> threads?

To simulate heavy load (more demand than supply). It's irrelevant. Make the numbers the same and you'll still see the latency problem.

>> Regardless of what the max connections per host limit is set to, though, at
>> least the first request should not block at all.
>
> Why?

Because the pool is empty the first time around. To be clear, I meant block *waiting for a connection from the pool to become available*.

> All these numbers are meaningless given such a small number of requests.
> You should be executing 10,000 HTTP requests in order to get any meaningful
> performance data.

In our production environment, we're talking to reliable services with strict SLAs over fast, reliable connections. We do tens of thousands of requests, and as I mentioned in my first email, we saw a large increase in request times when we upgraded from httpclient 3.1 to 4.0.

Some things become clearer when you isolate factors. Hence my benchmarks. Any statistician will tell you that when the minimum of 50 samples increases by 10x, it is statistically significant. The samples are coming from two very different populations.

Oleg, I've sent you my benchmark source code off-list. Feel free to use it, or not. I would be interested in any performance measurements that you or someone else at Apache has done.

Best regards,
Jared

Re: Poor performance of pooled connections worse in 4.0

Jared Jacobs wrote:
>> What is the point of having a 10-connection limit and using 50 worker
>> threads?
>
> To simulate heavy load (more demand than supply). It's irrelevant. Make the
> numbers the same and you'll still see the latency problem.

With 50 requests you will not even warm up the JIT compiler. You have to be doing 200,000 requests at the very least.

>>> Regardless of what the max connections per host limit is set to, though, at
>>> least the first request should not block at all.
>>
>> Why?
>
> Because the pool is empty the first time around. To be clear, I meant block
> *waiting for a connection from the pool to become available*.

It takes some time to set up a pool and open a connection.

>> All these numbers are meaningless given such a small number of requests.
>> You should be executing 10,000 HTTP requests in order to get any meaningful
>> performance data.
>
> In our production environment, we're talking to reliable services with
> strict SLAs over fast, reliable connections. We do tens of thousands of
> requests, and as I mentioned in my first email, we saw a large increase in
> request times when we upgraded from httpclient 3.1 to 4.0.
>
> Some things become clearer when you isolate factors. Hence my benchmarks.
> Any statistician will tell you that when the minimum of 50 samples increases
> by 10x, it is statistically significant. The samples are coming from two
> very different populations.
>
> Oleg, I've sent you my benchmark source code off-list. Feel free to use it,
> or not. I would be interested in any performance measurements that you or
> someone else at Apache has done.

Your benchmark is meaningless. You are comparing the execution speed of 50 requests over 50 connections to 50 requests over 10 connections. WHAT IS IT _EXACTLY_ you are trying to measure with your benchmark?

Here's the code I used to compare _throughput_ in terms of requests per second of different HTTP client transports. Feel free to adapt this code to your particular needs:

http://wiki.apache.org/HttpComponents/HttpClient3vsHttpClient4vsHttpCore

Oleg
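The benchmark code Oleg links to is not reproduced in this thread. As a rough illustration of the two ideas in his message (run enough iterations to get past JIT warm-up, then report requests per second), here is a minimal sketch. The class `WarmupBenchmark`, its `run` method, and the `Result` holder are hypothetical names, and the arithmetic stand-in workload is an assumption; nothing here is taken from the linked wiki page. The worst single latency is tracked as well, since throughput alone hides slow outliers.

```java
public class WarmupBenchmark {

    /** Throughput and worst-case latency of one measured run. */
    public static final class Result {
        public final double requestsPerSecond;
        public final long maxLatencyNanos;

        Result(double requestsPerSecond, long maxLatencyNanos) {
            this.requestsPerSecond = requestsPerSecond;
            this.maxLatencyNanos = maxLatencyNanos;
        }
    }

    /** Runs `warmup` untimed iterations, then times `measured` iterations. */
    public static Result run(Runnable request, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) {
            request.run(); // results discarded: gives the JIT a chance to compile the hot path
        }
        long maxLatency = 0;
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            long t0 = System.nanoTime();
            request.run();
            maxLatency = Math.max(maxLatency, System.nanoTime() - t0);
        }
        long total = Math.max(1, System.nanoTime() - start); // guard against division by zero
        return new Result(measured * 1e9 / total, maxLatency);
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption); a real run would execute an HTTP request here.
        Result r = run(() -> Math.sqrt(System.nanoTime()), 10_000, 200_000);
        System.out.printf("%.0f req/s, worst latency %d ns%n",
                r.requestsPerSecond, r.maxLatencyNanos);
    }
}
```

The iteration counts in `main` are placeholders; how many warm-up iterations are needed depends on the JVM and the workload.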

Re: Poor performance of pooled connections worse in 4.0

Thanks again for your response, Oleg.

We're dealing with a latency problem. The total duration of each and every request needs to be reliably small. Throughput isn't very important in our application.

I'll get to the bottom of the issue with some profiling.

Cheers,
Jared

Re: Poor performance of pooled connections worse in 4.0

Hi Jared,

I would be very interested to know if you have made any further progress on this potential problem.

Thanks,
Tony

Re: Poor performance of pooled connections worse in 4.0

Hi Tony.

Oleg was right to question my initial benchmark. In the connection pool benchmark, I was only issuing one request per thread, and contrary to my intuition, each thread runs slowly at first, even if other threads have already run the same code (i.e. it has already been JIT compiled). Subsequent requests in each thread were much faster. It didn't take anywhere near 10,000 requests to reach stable, optimal performance, though. The second and third requests in each thread, for example, were just as fast as the rest.

You might be wondering if the slowness of the first request that I'm talking about can be attributed to establishing an HTTP connection that is then reused by subsequent requests. Nope. I saw the same effect even when using a new HttpClient per request (i.e. no connection pooling or reuse).

Initially, my observations on one of our production servers raised two questions in my mind:

1) Is the overhead for pooling and reusing connections greater than the overhead of establishing new HTTP connections?
2) Is the performance of pooled connections significantly worse in HttpClient 4.0 than in 3.1?

After improving my benchmarks, I've concluded that, in general, the answer to both questions is "no" (as one would expect). Reusing connections increases throughput a bit when there's a steady stream of requests that need to be made to a particular host, and 4.0's ThreadSafeClientConnManager performs roughly as well as 3.1's MultiThreadedHttpConnectionManager.

Our server that was having problems was both low on memory and occasionally CPU bound. I believe it was those conditions that made 4.0's ThreadSafeClientConnManager slow for us when we first switched to it in production.

I'm still not certain which configuration minimizes request latency when requests need to be made to a single host pretty often, but at random times. I hope to have the time to answer this question for our use case by experimentation in the coming week.

Regards,
Jared

On Thu, Nov 12, 2009 at 5:38 AM, Tony Poppleton <tony.poppleton@...> wrote:
> Hi Jared,
>
> I would be very interested to know if you have made any further progress on
> this potential problem.
>
> Thanks,
> Tony
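The per-thread effect Jared describes (first request in each thread slow, later ones fast) can be checked by recording the first sample of every thread separately from the rest. The sketch below is one way to do that under stated assumptions: `FirstVsLaterRequests` and `measure` are hypothetical names, the workload is whatever `Runnable` the caller passes in, and `requestsPerThread` must be at least 2. It is not Jared's improved benchmark, which was never posted.

```java
public class FirstVsLaterRequests {

    /**
     * Starts `threads` fresh threads, each running `requestsPerThread` requests.
     * Returns { average first-request time, average later-request time } in nanoseconds.
     */
    public static double[] measure(Runnable request, int threads, int requestsPerThread)
            throws InterruptedException {
        long[][] times = new long[threads][requestsPerThread];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long[] mine = times[t]; // each thread writes only to its own row
            workers[t] = new Thread(() -> {
                for (int i = 0; i < mine.length; i++) {
                    long start = System.nanoTime();
                    request.run();
                    mine[i] = System.nanoTime() - start;
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // join() makes each thread's writes visible to this thread
        }
        double firstSum = 0;
        double laterSum = 0;
        for (long[] perThread : times) {
            firstSum += perThread[0];
            for (int i = 1; i < perThread.length; i++) {
                laterSum += perThread[i];
            }
        }
        double firstAvg = firstSum / threads;
        double laterAvg = laterSum / ((double) threads * (requestsPerThread - 1));
        return new double[] { firstAvg, laterAvg };
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in workload (an assumption); substitute a real HTTP request to reproduce the effect.
        double[] r = measure(() -> Math.sqrt(System.nanoTime()), 10, 5);
        System.out.printf("first: %.0f ns, later: %.0f ns%n", r[0], r[1]);
    }
}
```

Comparing the two averages separates "this thread is cold" from "this request needed a new connection", which a single min/avg/max over one request per thread cannot do.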