[ruby-core:24058] [Bug #1696] http downloads are unuseably slow

View: New views
19 Messages — Rating Filter:   Alert me  

[ruby-core:24058] [Bug #1696] http downloads are unuseably slow

by Hongli Lai-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Bug #1696: http downloads are unuseably slow
http://redmine.ruby-lang.org/issues/show/1696

Author: Steven Hartland
Status: Open, Priority: High
ruby -v: ruby 1.8.6 (2008-08-11 patchlevel 287) [amd64-freebsd7]

I've been using a a bit of code which use open-uri and http urls to download files and have found this is unuseably slow.
Example:

require 'open-uri'
open( <http uri> )

I did some benchmarking and the download of a 157MB file from the local machine was taking over 20 seconds and using 100% where as wget for the same file only took 0.7seconds.

Digging some more and profiling the code with RubyProf revealed that during the execution of the download over 11,000 threads where being created. This I've tracked down to the net/protocol module and the Net::BufferIO::rbuf_fill method which is using a timeout block to wrap the @io.sysread(1024) call. This is clearly an extremely bad way to do this, totally crippling the performance.

Below are benchmarks and some profiles of this issue and various quick fixes I tested here
== 1.8.6 core ==
      user     system      total        real
open 15.664062   6.164062  21.828125 ( 22.810859)

 %self     total     self     wait    child    calls  name
 14.04      5.27     3.00     2.27     0.00   109426  <Class::Thread>#start (ruby_runtime:0}
 10.97     13.14     2.34     0.00    10.80   109427  <Module::Timeout>#timeout (/usr/local/lib/ruby/1.8/timeout.rb:52}
  6.65      2.52     1.42     1.10     0.00   109426  Thread#kill (ruby_runtime:0}
  5.78     21.36     1.23     0.00    20.12        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  5.27     14.27     1.12     0.00    13.14   109427  Object#timeout (/usr/local/lib/ruby/1.8/timeout.rb:92}
  5.12      1.11     1.09     0.02     0.00   109426  IO#sysread (ruby_runtime:0}
  4.75      2.61     1.02     0.00     1.59   109426  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  3.40      3.62     0.73     0.00     2.89   109426  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:380}
  2.52      0.54     0.54     0.00     0.00   109435  String#slice! (ruby_runtime:0}
  2.30      1.33     0.49     0.00     0.84   109415  #<Class:0x801ac6f30>#<< ((eval):1}
  2.08      0.50     0.45     0.00     0.05   109427  Class#initialize (ruby_runtime:0}
  2.08      0.98     0.45     0.00     0.54   109435  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:137}
  2.01      4.05     0.43     0.00     3.62   109426  Net::ReadAdapter#<< (/usr/local/lib/ruby/1.8/net/protocol.rb:371}
  1.94      0.41     0.41     0.00     0.00   109416  IO#write (ruby_runtime:0}
  1.72      1.03     0.37     0.00     0.66   109435  Class#new (ruby_runtime:0}
  1.50      0.32     0.32     0.00     0.00   328286  String#size (ruby_runtime:0}
  1.32     14.55     0.28     0.00    14.27   109426  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:131}
  1.21      0.84     0.26     0.00     0.58   109417  Kernel#__send__ (ruby_runtime:0}
  1.13      0.24     0.24     0.00     0.00   109429  String#<< (ruby_runtime:0}
  0.99      0.30     0.21     0.00     0.09   109428  Fixnum#== (ruby_runtime:0}
  0.84      0.18     0.18     0.00     0.00   328289  Fixnum#+ (ruby_runtime:0}
  0.77      0.16     0.16     0.00     0.00   109427  <Class::Class>#allocate (ruby_runtime:0}
  0.77      0.58     0.16     0.00     0.41   109415  IO#<< (ruby_runtime:0}
  0.73      0.28     0.16     0.00     0.12   109475  Hash#[] (ruby_runtime:0}
...

== 1.8.6 core + no timeout block ==
      user     system      total        real
open  1.757812   0.765625   2.523438 (  2.873194)

 %self     total     self     wait    child    calls  name
 21.62      2.63     2.63     0.00     0.00   109426  IO#sysread (ruby_runtime:0}
 15.72      1.91     1.91     0.00     0.00   109429  String#<< (ruby_runtime:0}
 13.98      1.70     1.70     0.00     0.00   109416  IO#write (ruby_runtime:0}
  9.56     12.16     1.16     0.00    10.99        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  7.83      3.73     0.95     0.00     2.77   109426  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  5.58      4.68     0.68     0.00     4.00   109426  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  3.66      4.99     0.45     0.00     4.55   109426  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:138}
  3.34      2.38     0.41     0.00     1.97   109415  #<Class:0x801ac82b8>#<< ((eval):1}
  3.27      0.53     0.40     0.00     0.13   109435  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:142}
  2.95      5.04     0.36     0.00     4.68   109426  Net::ReadAdapter#<< (/usr/local/lib/ruby/1.8/net/protocol.rb:376}
  2.25      0.27     0.27     0.00     0.00   328289  Fixnum#+ (ruby_runtime:0}
  1.54      0.19     0.19     0.00     0.00   328286  String#size (ruby_runtime:0}
  1.28      0.27     0.16     0.00     0.12   109475  Hash#[] (ruby_runtime:0}
  1.22      0.15     0.15     0.00     0.00   109431  Module#=== (ruby_runtime:0}
  1.22      1.85     0.15     0.00     1.70   109415  IO#<< (ruby_runtime:0}
  1.09      0.13     0.13     0.00     0.00   109435  String#slice! (ruby_runtime:0}
  0.96      1.97     0.12     0.00     1.85   109417  Kernel#__send__ (ruby_runtime:0}
  0.96      0.12     0.12     0.00     0.00   109426  String#length (ruby_runtime:0}
  0.96      0.12     0.12     0.00     0.00   109435  Hash#default (ruby_runtime:0}
  0.71      0.09     0.09     0.00     0.00   109439  Fixnum#< (ruby_runtime:0}
...

== 1.8.6 core + no timeout block + 10K sysread ==
      user     system      total        real
open  0.570312   1.406250   1.976562 (  2.521614)

 %self     total     self     wait    child    calls  name
 27.32      0.39     0.39     0.00     0.00    10953  String#slice! (ruby_runtime:0}
 18.58      0.27     0.27     0.00     0.00    10943  IO#write (ruby_runtime:0}
 12.02      0.17     0.17     0.00     0.00    10944  IO#sysread (ruby_runtime:0}
  8.74      1.42     0.12     0.00     1.30        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  7.10      0.50     0.10     0.00     0.40    10944  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  5.46      0.37     0.08     0.00     0.29    10942  #<Class:0x801ac8b28>#<< ((eval):1}
  3.28      0.05     0.05     0.00     0.00    32840  String#size (ruby_runtime:0}
  2.73      0.55     0.04     0.00     0.52    10944  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  2.73      0.04     0.04     0.00     0.00    32843  Fixnum#+ (ruby_runtime:0}
  2.73      0.59     0.04     0.00     0.55    10944  Net::ReadAdapter#<< (/usr/local/lib/ruby/1.8/net/protocol.rb:376}
  1.64      0.21     0.02     0.00     0.19    10944  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:138}
  1.64      0.41     0.02     0.00     0.39    10953  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:142}
  1.09      0.02     0.02     0.00     0.00    10944  String#length (ruby_runtime:0}
  1.09      0.02     0.02     0.00     0.00    10947  String#<< (ruby_runtime:0}
  1.09      0.02     0.02     0.00     0.00    10953  Hash#default (ruby_runtime:0}
  1.09      0.28     0.02     0.00     0.27    10942  IO#<< (ruby_runtime:0}
...

== 1.8.6 core + no timeout block + 100K sysread ==
      user     system      total        real
open  0.390625   0.484375   0.875000 (  1.013516)

 %self     total     self     wait    child    calls  name
 45.45      0.43     0.43     0.00     0.00     1098  IO#sysread (ruby_runtime:0}
 33.88      0.32     0.32     0.00     0.00     1097  IO#write (ruby_runtime:0}
  8.26      0.08     0.08     0.00     0.00     1107  String#slice! (ruby_runtime:0}
  1.65      0.45     0.02     0.00     0.44     1098  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:138}
  1.65      0.02     0.02     0.00     0.00     3302  String#size (ruby_runtime:0}
  1.65      0.35     0.02     0.00     0.34     1098  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  1.65      0.93     0.02     0.00     0.91        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  0.83      0.01     0.01     0.00     0.00        1  URI::HTTP#initialize (/usr/local/lib/ruby/1.8/uri/http.rb:77}
  0.83      0.09     0.01     0.00     0.08     1107  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:142}
  0.83      0.01     0.01     0.00     0.00      357  Module#method_added (ruby_runtime:0}
  0.83      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
  0.83      0.01     0.01     0.00     0.00     1101  String#<< (ruby_runtime:0}
  0.83      0.36     0.01     0.00     0.35     1098  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  0.83      0.33     0.01     0.00     0.32     1096  IO#<< (ruby_runtime:0}
...

== 1.8.6 core + no timeout block + 1M sysread ==
      user     system      total        real
open  0.445312   0.335938   0.781250 (  0.913736)

 %self     total     self     wait    child    calls  name
 29.66      0.27     0.27     0.00     0.00      157  IO#sysread (ruby_runtime:0}
 28.81      0.27     0.27     0.00     0.00      166  String#slice! (ruby_runtime:0}
 28.81      0.27     0.27     0.00     0.00      156  IO#write (ruby_runtime:0}
  7.63      0.07     0.07     0.00     0.00      160  String#<< (ruby_runtime:0}
  1.69      0.30     0.02     0.00     0.28      157  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  0.85      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
  0.85      0.01     0.01     0.00     0.00       98  Class#inherited (ruby_runtime:0}
  0.85      0.02     0.01     0.00     0.02        2  Kernel#gem_original_require (ruby_runtime:0}
  0.85      0.28     0.01     0.00     0.27      157  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
...

== 1.8.6 core + no timeout block + 1M sysread + slice bypass ==
      user     system      total        real
open  0.093750   0.523438   0.617188 (  0.717748)

 %self     total     self     wait    child    calls  name
 66.67      0.23     0.23     0.00     0.00       52  IO#write (ruby_runtime:0}
 13.33      0.05     0.05     0.00     0.00       52  IO#sysread (ruby_runtime:0}
  4.44      0.02     0.02     0.00     0.00       55  String#<< (ruby_runtime:0}
  4.44      0.02     0.02     0.00     0.00        1  TCPSocket#initialize (ruby_runtime:0}
  2.22      0.01     0.01     0.00     0.00      101  Hash#[] (ruby_runtime:0}
  2.22      0.01     0.01     0.00     0.00       10  String#slice! (ruby_runtime:0}
  2.22      0.02     0.01     0.00     0.01        2  Kernel#gem_original_require (ruby_runtime:0}
  2.22      0.25     0.01     0.00     0.24       52  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  2.22      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
...

Note: The change does also raises questions about the core slice method not performing as well as it might for the no-op case.

== 1.8.6 core + select timeout + 1M sysread + slice bypass ==
      user     system      total        real
open  0.070312   0.226562   0.296875 (  0.898285)

 %self     total     self     wait    child    calls  name
 64.10      0.20     0.20     0.00     0.00       49  IO#write (ruby_runtime:0}
 15.38      0.05     0.05     0.00     0.00       49  IO#sysread (ruby_runtime:0}
  7.69      0.02     0.02     0.00     0.00       52  String#<< (ruby_runtime:0}
  5.13      0.23     0.02     0.00     0.21       49  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  2.56      0.02     0.01     0.00     0.02        2  Kernel#gem_original_require (ruby_runtime:0}
  2.56      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
  2.56      0.02     0.01     0.00     0.01        3  Array#each (ruby_runtime:0}
...

So overall these changes improve the speed from 22seconds to 0.7seconds in my test case making ruby comparable with wget's performance.

The patch for this last run is attached and should be compatible with existing functionality through the use of select + raise Timeout::Error instead of timeout( @read_timeout ) { .. }

Hope this helps, any questions let me know


----------------------------------------
http://redmine.ruby-lang.org


[ruby-core:24059] [Bug #1696] http downloads are unuseably slow

by Hongli Lai-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Issue #1696 has been updated by Steven Hartland.


The patch description above should read: patch for select timeout + 1M sysread + slice bypass
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1696

----------------------------------------
http://redmine.ruby-lang.org


[ruby-core:24060] [Bug #1696] http downloads are unuseably slow

by Hongli Lai-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Issue #1696 has been updated by Steven Hartland.

File protocol.rb.patch added

Silly bug in original patch, fixed version here, results still consistent with original.

It may be interesting to note that this also brings net/http speed inline with that of libcurl ruby gems
open.core     15.296875   1.437500  16.734375 ( 22.416853)
open.patched   0.093750   0.218750   0.312500 (  0.340683)
taf2-curb      0.171875   0.273438   0.445312 (  0.546568)

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1696

----------------------------------------
http://redmine.ruby-lang.org


[ruby-core:24061] Re: [Bug #1696] http downloads are unuseably slow

by Muhammad Ali-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Net/HTTP in 1.9.2dev is already working as you described with two exceptions:
1 - It uses nonblocking I/O
2 - It uses a 16KB buffer
The 1MB buffer you are using is too large for several reasons:
1 - You use blocking I/O calls, this works great if you are downloading from the local machine. OTOH if the server you are connecting to is slow, attempting to collect such a large buffer in a blocking manner is guaranteed to cause serious delays.
2 - If multiple threads are downloading data together (which is normal for crawlers) you will be allocating a lot of memory at once this way (mutliple strings up to 1MB each). This will propably tax the GC.
You might want to upgrade to 1.8.7 (which implements read_nonblock) and modify your patch for a non blocking behavior
oldmoe
On Sat, Jun 27, 2009 at 9:47 PM, Steven Hartland <redmine@...> wrote:
Issue #1696 has been updated by Steven Hartland.

File protocol.rb.patch added

Silly bug in original patch, fixed version here, results still consistent with original.

It may be interesting to note that this also brings net/http speed inline with that of libcurl ruby gems
open.core     15.296875   1.437500  16.734375 ( 22.416853)
open.patched   0.093750   0.218750   0.312500 (  0.340683)
taf2-curb      0.171875   0.273438   0.445312 (  0.546568)

----------------------------------------
----------------------------------------


[ruby-core:24062] Re: [Bug #1696] http downloads are unuseably slow

by Roger Pack-6 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Net/HTTP in 1.9.2dev is already working as you described with two
> exceptions:
> 1 - It uses nonblocking I/O
> 2 - It uses a 16KB buffer

does 1.9.2dev work as quickly then?


[ruby-core:24064] Re: [Bug #1696] http downloads are unuseably slow

by Muhammad Ali-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Not sure really, never tested it. 

On Sun, Jun 28, 2009 at 2:29 AM, Roger Pack <rogerdpack@...> wrote:
> Net/HTTP in 1.9.2dev is already working as you described with two
> exceptions:
> 1 - It uses nonblocking I/O
> 2 - It uses a 16KB buffer

does 1.9.2dev work as quickly then?



[ruby-core:24066] Re: [Bug #1696] http downloads are unuseably slow

by Eero Saynatkari-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Excerpts from rogerdpack's message of Sun Jun 28 02:29:12 +0300 2009:
> > Net/HTTP in 1.9.2dev is already working as you described with two
> > exceptions:
> > 1 - It uses nonblocking I/O
> > 2 - It uses a 16KB buffer
>
> does 1.9.2dev work as quickly then?

Whether it does or not, at least decreasing the buffer size
is absolutely necessary. The performance of a HTTP download
cannot be tuned based on a local, large file.


Eero
--
Magic is insufficiently advanced technology.



[ruby-core:24071] [Bug #1696] http downloads are unuseably slow

by Hongli Lai-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Issue #1696 has been updated by Steven Hartland.


As you can see from the test results a 16Kb buffer is way to small for high bandwidth connection, so as Eero mentioned this definitely needs to be documenting. N.B. I assume this has been moved to either a class variable or other accessible option.

None blocking support should be trivial, simply setting the flag on the socket should be sufficient, but with the current patch select will be achieving this anyway assuming sysread is actually using recv under the covers and returns with what ever data is available instead of waiting for size data. This is not clear from the docs as sysread, readpartial and read_nonblock contain conflicting info and in places some English which just doesn't make sense. If not just changing that one line should be sufficient.

Having a quick look at the current nightly snapshot the code:
1. It may not be ordered in the best way for performance especially on slower connections. I would suggest testing if inverting the order and allowing select to do its work before doing a none blocking read, or in fact a standard recv ( no need for none blocking, as its already guaranteed to work due to the select check ). Careful benchmarking of differing speed connections would be needed to confirm which is better.
2. slice is still in place, which could also still be causing an issue unless the underlying slice implementation has been fixed for the no-op case.

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1696

----------------------------------------
http://redmine.ruby-lang.org


[ruby-core:24072] Re: [Bug #1696] http downloads are unuseably slow

by Muhammad Ali-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I believe if the descriptor is set to non blocking then sysread will return whatever data available or raise an error if there isn't any.

But non block is critical for large buffer sizes as select will return the socket as ready even if it has only one byte to read.

I would also recommend the nonblocking call happens before the select. This way fast connections will not be taxed by two system calls, while slow connections are slow anyway so they can wait for the Errno::EAGAIN exception to be handled

oldmoe

On Sun, Jun 28, 2009 at 9:26 PM, Steven Hartland <redmine@...> wrote:
Issue #1696 has been updated by Steven Hartland.


As you can see from the test results a 16Kb buffer is way to small for high bandwidth connection, so as Eero mentioned this definitely needs to be documenting. N.B. I assume this has been moved to either a class variable or other accessible option.

None blocking support should be trivial, simply setting the flag on the socket should be sufficient, but with the current patch select will be achieving this anyway assuming sysread is actually using recv under the covers and returns with what ever data is available instead of waiting for size data. This is not clear from the docs as sysread, readpartial and read_nonblock contain conflicting info and in places some English which just doesn't make sense. If not just changing that one line should be sufficient.

Having a quick look at the current nightly snapshot the code:
1. It may not be ordered in the best way for performance especially on slower connections. I would suggest testing if inverting the order and allowing select to do its work before doing a none blocking read, or in fact a standard recv ( no need for none blocking, as its already guaranteed to work due to the select check ). Careful benchmarking of differing speed connections would be needed to confirm which is better.
2. slice is still in place, which could also still be causing an issue unless the underlying slice implementation has been fixed for the no-op case.

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1696

----------------------------------------
http://redmine.ruby-lang.org



[ruby-core:24075] [Bug #1696] http downloads are unuseably slow

by Hongli Lai-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Issue #1696 has been updated by Steven Hartland.


> But non block is critical for large buffer sizes as select will return the
> socket as ready even if it has only one byte to read.

This makes no difference to the socket level recv, even if the socket isn't
set as none blocking, it will always return what it has. It doesn't wait till
the data size matches size of the buffer passed, so again it depends on how the
lower level methods actually deal with this; which is not clear from the docs.

> I would also recommend the nonblocking call happens before the select. This
> way fast connections will not be taxed by two system calls, while slow
> connections are slow anyway so they can wait for the Errno::EAGAIN exception
> to be handled.

This really should be profiled instead of making assumptions. I say that as I
suspect exception handling is actually quite a bit heavier than a select call,
so if its used in any significant amount, it could cause noticeably higher CPU
load. It could go either way, but its always better to know and document why a
design decision was made than to make just make it on a supposition.

A very quick test of 10 requests for a 157MB from a high speed local connection
shows:
                        user     system      total        real
open.none-blocking      5.195312   4.023438   9.218750 ( 11.924775)
open.blocking-patched   2.312500   4.562500   6.875000 (  9.244287)

N.B. The none-blocking test was done using 1.8 install just replacing protocol.rb
with the nightly snapshot, increasing the buffer to 1MB and removing the rescue
type for compatibility.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1696

----------------------------------------
http://redmine.ruby-lang.org


[ruby-core:24081] Re: [Bug #1696] http downloads are unuseably slow

by Tanaka Akira-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

In article <4a464441bf3f7_13bd3907d016634@...>,
  Steven Hartland <redmine@...> writes:

> This I've tracked down to the net/protocol module and the Net::BufferIO::rbuf_fill method which is using a timeout block to wrap the @io.sysread(1024) call. This is clearly an extremely bad way to do this, totally crippling the performance.

There is a reson to use timeout().  It is used to fix
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=265429

Before the fix, IO.select was used in rbuf_fill.
--
Tanaka Akira


[ruby-core:24082] Re: [Bug #1696] http downloads are unuseably slow

by Eero Saynatkari-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Excerpts from Tanaka Akira's message of Mon Jun 29 21:17:58 +0300 2009:
> In article <4a464441bf3f7_13bd3907d016634@...>,
>   Steven Hartland <redmine@...> writes:
>
> > This I've tracked down to the net/protocol module and the Net::BufferIO::rbuf_fill method which is using a timeout block to wrap the @io.sysread(1024) call. This is clearly an extremely bad way to do this, totally crippling the performance.
>
> There is a reson to use timeout().  It is used to fix
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=265429
>
> Before the fix, IO.select was used in rbuf_fill.

On the face of it, using timeout does not seem like the
optimal fix to the problem...speaking of, can anyone
reproduce this issue currently, or has it gotten fixed
otherwise in the 5 intervening years?


Eero
--
Magic is insufficiently advanced technology.



[ruby-core:24085] Re: [Bug #1696] http downloads are unuseably slow

by James Gray-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Jun 29, 2009, at 1:38 PM, Eero Saynatkari wrote:

> Excerpts from Tanaka Akira's message of Mon Jun 29 21:17:58 +0300  
> 2009:
>> In article <4a464441bf3f7_13bd3907d016634@...>,
>>  Steven Hartland <redmine@...> writes:
>>
>>> This I've tracked down to the net/protocol module and the  
>>> Net::BufferIO::rbuf_fill method which is using a timeout block to  
>>> wrap the @io.sysread(1024) call. This is clearly an extremely bad  
>>> way to do this, totally crippling the performance.
>>
>> There is a reson to use timeout().  It is used to fix
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=265429
>>
>> Before the fix, IO.select was used in rbuf_fill.
>
> On the face of it, using timeout does not seem like the
> optimal fix to the problem...speaking of, can anyone
> reproduce this issue currently, or has it gotten fixed
> otherwise in the 5 intervening years?

Well, this definitely sounds like the kind of thing I am seeing.

James Edward Gray II


[ruby-core:24086] Re: [Bug #1696] http downloads are unuseably slow

by James Gray-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Jun 29, 2009, at 1:17 PM, Tanaka Akira wrote:

> In article <4a464441bf3f7_13bd3907d016634@...>,
>  Steven Hartland <redmine@...> writes:
>
>> This I've tracked down to the net/protocol module and the  
>> Net::BufferIO::rbuf_fill method which is using a timeout block to  
>> wrap the @io.sysread(1024) call. This is clearly an extremely bad  
>> way to do this, totally crippling the performance.
>
> There is a reson to use timeout().  It is used to fix
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=265429
>
> Before the fix, IO.select was used in rbuf_fill.

And Timeout does get to interrupt that?

James Edward Gray II


[ruby-core:24087] Re: [Bug #1696] http downloads are unuseably slow

by Eero Saynatkari-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Excerpts from James Gray's message of Tue Jun 30 00:12:23 +0300 2009:
>
> Well, this definitely sounds like the kind of thing I am seeing.

Which system are you on, and do you have some reasonably
small test case? Presumably requires recompiling without
the timeout?


Eero
--
Magic is insufficiently advanced technology.



[ruby-core:24088] Re: [Bug #1696] http downloads are unuseably slow

by James Gray-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Jun 29, 2009, at 4:38 PM, Eero Saynatkari wrote:

> Excerpts from James Gray's message of Tue Jun 30 00:12:23 +0300 2009:
>>
>> Well, this definitely sounds like the kind of thing I am seeing.
>
> Which system are you on, and do you have some reasonably
> small test case? Presumably requires recompiling without
> the timeout?

I haven't been able to reliably reproduce this, no:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/24018

James Edward Gray II


[ruby-core:24090] Re: [Bug #1696] http downloads are unuseably slow

by Tanaka Akira-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

In article <473C3283-40AA-4079-92C0-31FA11A64BE0@...>,
  James Gray <james@...> writes:

> Well, this definitely sounds like the kind of thing I am seeing.

Do you use SSL?
--
Tanaka Akira


[ruby-core:24096] Re: [Bug #1696] http downloads are unuseably slow

by James Gray-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Jun 30, 2009, at 12:05 AM, Tanaka Akira wrote:

> In article <473C3283-40AA-4079-92C0-31FA11A64BE0@...>,
>  James Gray <james@...> writes:
>
>> Well, this definitely sounds like the kind of thing I am seeing.
>
> Do you use SSL?

Yes.

James Edward Gray II


[ruby-core:24098] Re: [Bug #1696] http downloads are unuseably slow

by Young Hyun :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Jun 29, 2009, at 2:12 PM, James Gray wrote:

> On Jun 29, 2009, at 1:38 PM, Eero Saynatkari wrote:
>
>> Excerpts from Tanaka Akira's message of Mon Jun 29 21:17:58 +0300  
>> 2009:
>>> In article <4a464441bf3f7_13bd3907d016634@...>,
>>> Steven Hartland <redmine@...> writes:
>>>
>>>> This I've tracked down to the net/protocol module and the  
>>>> Net::BufferIO::rbuf_fill method which is using a timeout block to  
>>>> wrap the @io.sysread(1024) call. This is clearly an extremely bad  
>>>> way to do this, totally crippling the performance.
>>>
>>> There is a reson to use timeout().  It is used to fix
>>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=265429
>>>
>>> Before the fix, IO.select was used in rbuf_fill.
>>
>> On the face of it, using timeout does not seem like the
>> optimal fix to the problem...speaking of, can anyone
>> reproduce this issue currently, or has it gotten fixed
>> otherwise in the 5 intervening years?
>
> Well, this definitely sounds like the kind of thing I am seeing.

For what it's worth, I've been fighting a similar problem for a while  
now, trying Ruby 1.8.5, 1.8.6, and 1.8.7.  The circumstances are  
similar: TCP connection with SSL, Ruby hanging on sysread, and very  
rare failure (once every 3 months with persistent connections).

One possibility is that IO.select is returning a false readiness  
notification, so a subsequent sysread hangs.  Linux manpages note that  
select(2) "may report a socket file descriptor as 'ready for reading',  
while nevertheless a subsequent read blocks" [1], and the recommended  
solution is to work with nonblocking sockets.  (Incidentally, my  
failures were under FreeBSD.)  However, until Ruby 1.9, SSL sockets  
didn't support nonblocking I/O, so we were always susceptible to this  
problem; we just lucked out because of the rarity of this failure mode.

Another possibility, and the one I have some suspicion of, is that  
there's some pathological interaction with SSL; that is, IO.select  
indicates readiness (and there actually are some bytes available to  
read), the client invokes SSLSocket#sysread, and the SSL library does  
some low-level SSL re-negotiations (which according to the SSL  
protocol can happen at any time) and then blocks waiting on actual  
user-level data.       Or instead of a low-level SSL re-negotiation,  
perhaps the connection with the peer is lost (say, the remote box  
loses power so that there's no notification of connection loss) in an  
inopportune moment in SSLSocket#sysread (for example, in the middle of  
reading an SSL protocol message), and SSLSocket#sysread hangs since it  
has no way of detecting the connection loss (assuming SSLSocket isn't  
using keepalives).

Anyway, this is all speculation, but I've had multiple failures over  
the past 18 months, and a perusal of the Ruby core dump shows the  
exact same pattern of a hang on select(2) called underneath in the  
implementation of rb_io_wait_readable() and similar internal functions  
(I'm not talking about a hang in select(2) as a direct result of a  
user-level code invoking IO.select()).

I'm going to try using the nonblocking SSL operations under Ruby 1.9  
to see whether it prevents my particular problem.  In general, though,  
it seems safest to always use nonblocking sockets with Ruby if you're  
going to use IO.select.

  --Young

   [1] http://linux.die.net/man/2/select