[ruby-core:23063] [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version


by Yui NARUSE

Bug #1332: Reading file on Windows is 500x slower than with previous Ruby version
http://redmine.ruby-lang.org/issues/show/1332

Author: Damjan Rems
Status: Open, Priority: Normal
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

That is about a 5x slower write and a 500x slower read. The times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close

Tried on two machines. Vista SP1 and XP SP3. Same results.

Tried with virus scanner disabled. Same results.

Tried on an old Win2K P4 2.4 GHz machine without a virus scanner:
"0 1.0625"
"1 1.09375"
"2 111.171875"

That's 111 seconds to read a 14,623,232-byte file, which is probably read from cache anyway.
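
The file size and the slowdown figures can be checked from the script and timings above. The following is only a sketch of that arithmetic; the variable names are made up for illustration, and the ratios use the 1.9.1 and 1.8.6 timings quoted in the report.

```ruby
# Size of the string built by the test script.
words      = 26**4               # 'aaaa'.upto('zzzz') yields every 4-letter lowercase string
base_bytes = words * 4           # each of those strings is 4 ASCII bytes
total      = base_bytes * 2**3   # 3.times { c << c } doubles the string three times

puts total

# Slowdown ratios, 1.9.1 timing divided by 1.8.6 timing (values from the report).
write_ratio = 0.696244 / 0.133
read_ratio  = 40.188834 / 0.087
puts format("write: %.1fx slower, read: %.0fx slower", write_ratio, read_ratio)
```

The computed size matches the 14,623,232 bytes mentioned above; the ratios come out near 5x for the write and somewhat under 500x for the read.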

The problem doesn't seem to exist on Linux, although I have only tried Ruby 1.9.0 there.


by
TheR


----------------------------------------
http://redmine.ruby-lang.org


[ruby-core:26505] [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version

by Yui NARUSE

Issue #1332 has been updated by Roger Pack.


I believe this is related to other issues regarding reading files in non-binary mode being slow in 1.9:

>>  a = File.open('l', 'w'); 10000000.times { a.write "abc\n" }; a.close
>> Benchmark.measure { a = File.open('l', 'r'); a.readlines; a.close }.real
=> 11.890625
>> Benchmark.measure { a = File.open('l', 'rb'); a.readlines; a.close }.real
=> 3.59375

I believe that it is doing a string conversion from one encoding ["\r\n"] to another ["\n"].

Perhaps there is a way to speed this up? (ex: special case it somehow)?
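
A minimal sketch of what "special casing" could look like at the script level, assuming the only text-mode behaviour needed is CRLF to LF conversion: read the file in binary mode and convert the newlines by hand. `read_text_fast` is a hypothetical helper name, and the UTF-8 default is an assumption, not something from this thread.

```ruby
require "benchmark"
require "tmpdir"

# Hypothetical helper: read in binary mode, then normalize CRLF manually.
def read_text_fast(path, encoding = Encoding::UTF_8)
  data = File.open(path, "rb") { |f| f.read }  # binary mode: no newline conversion in the IO layer
  data.force_encoding(encoding)                # binary reads come back tagged ASCII-8BIT
  data.gsub!("\r\n", "\n")                     # in place; returns nil when nothing matched
  data
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.txt")
  File.open(path, "wb") { |f| 100_000.times { f.write("abc\r\n") } }

  text_time   = Benchmark.realtime { File.open(path, "r") { |f| f.read } }
  manual_time = Benchmark.realtime { read_text_fast(path) }
  puts format("text mode: %.4fs, binary + gsub: %.4fs", text_time, manual_time)
end
```

Only the "\r\n" pairs are touched here; any other text-mode behaviour of the platform is not reproduced, so this is a workaround for scripts rather than a fix for the interpreter.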

-r

refs:
http://www.ruby-forum.com/topic/182691
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/24824
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1332



[ruby-core:26515] Re: [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version

by U.Nakamura

Hello,

In message "[ruby-core:26505] [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version"
    on Nov.04,2009 04:50:49, <redmine@...> wrote:
> I believe that it is doing a string conversion from one encoding ["\r\n"] to another ["\n"].

Right.


> Perhaps there is a way to speed this up? (ex: special case it somehow)?

Currently, we have implemented the newline conversion as a
transcode converter, just like encoding conversion.
But we have found that the design of transcode is too general
for such a simple operation.
We want to find a better mechanism which doesn't deviate
from the current design of IO...
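
As a rough illustration of the point above (a sketch, not the interpreter's internal code), newline conversion in Ruby 1.9 and later is requested through the same `String#encode` options as encoding conversion, while the CRLF case alone can be written as a plain substitution. The sample strings are made up for illustration.

```ruby
crlf = "abc\r\ndef\r\n"

# Newline conversion requested through the general transcoding API.
via_transcoder = crlf.encode(universal_newline: true)

# The same API combines newline conversion with a real encoding conversion.
latin1 = "caf\xE9\r\n".dup.force_encoding("ISO-8859-1")
utf8   = latin1.encode("UTF-8", universal_newline: true)

# A hand-written special case for the CRLF -> LF step only.
by_hand = crlf.gsub("\r\n", "\n")

p via_transcoder
p utf8
p via_transcoder == by_hand
```

For input that contains only CRLF line endings the two approaches give the same string; the general converter additionally has to handle every other encoding pair, which is the generality mentioned above.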


Regards,
--
U.Nakamura <usa@...>



[ruby-core:26536] Re: [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version

by Jon M

> Currently, we have implemented the newline conversion as a
> transcode converter, just like encoding conversion.
> But we have found that the design of transcode is too general
> for such a simple operation.
> We want to find a better mechanism which doesn't deviate
> from the current design of IO...

Do you think the current transcode design is also the cause of

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/24839


Jon


[ruby-core:26609] Re: [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version

by Roger Pack-6

> Do you think the current transcode design is also the cause of
>
> http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/24839

Yep, that's the one.
I suppose there are a few options...
1) fix encoding transformations so that they're faster

2) [slightly anathema] have files default to 'binary' on Windows (I
would prefer this anyway, but that's another story).

3) special case this transformation "somewhere" to make it faster.
Thoughts?
-r


[ruby-core:26840] [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version

by Yui NARUSE

Issue #1332 has been updated by Roger Pack.


A temporary workaround [though not actually binary compatible] appears to be:

Index: ruby.c
===================================================================
--- ruby.c      (revision 25830)
+++ ruby.c      (working copy)
@@ -1484,6 +1484,7 @@
        int fd, mode = O_RDONLY;
 #if defined DOSISH || defined __CYGWIN__
        {
+           mode |= O_BINARY;
            const char *ext = strrchr(fname, '.');
            if (ext && STRCASECMP(ext, ".exe") == 0)
                mode |= O_BINARY;

This causes all Ruby script files to be loaded as binary.  The drawback is that if you have a Ruby script that was saved in text mode (with CRLF line endings) and contains strings that wrap lines, those strings will have an extra "\r" in them, e.g.:

>> File.write 'stringy.rb', "a=\"abc\r\ndef\"; puts a.inspect"

normal ruby:
C:\>ruby stringy.rb
"abc\ndef"

patched ruby:
C:\>ruby stringy.rb
"abc\r\ndef"

But if your files were saved in binary mode (LF-only line endings), the result will be the same.
And the slowdown is gone for now.
Hopefully a better fix can be created.
Thanks.
-r


[ruby-core:26884] Re: [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version

by U.Nakamura

Hello,

In message "[ruby-core:26840] [Bug #1332] Reading file on Windows is 500x slower than with previous Ruby version"
    on Nov.21,2009 08:10:45, <redmine@...> wrote:
> This causes all Ruby script files to be loaded as binary.  The drawback is that if you have a Ruby script that was saved in text mode (with CRLF line endings) and contains strings that wrap lines, those strings will have an extra "\r" in them, e.g.:

The pseudo-IO DATA treats the script file as a data file.
So, changing the default mode breaks compatibility for such
scripts.
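
A small sketch of the DATA concern, assuming a child interpreter can be started through `RbConfig.ruby`: DATA reads the script file itself after `__END__`, so the mode used to open the script decides whether "\r" shows up in what DATA returns. The script contents and file name prefix are made up for illustration.

```ruby
require "rbconfig"
require "tempfile"

# A tiny script whose data section (after __END__) uses CRLF line endings.
script = "print DATA.read.inspect\n__END__\none\r\ntwo\r\n"

data_seen = Tempfile.create(["data_demo", ".rb"]) do |f|
  f.binmode          # write the bytes exactly as given
  f.write(script)
  f.flush
  # Run the script in a child interpreter and capture what DATA.read returned.
  IO.popen([RbConfig.ruby, f.path], &:read)
end

puts data_seen
```

Whether the captured string contains "\r\n" or "\n" depends on how the interpreter opened the script, which is why switching the default to binary would change what existing scripts see through DATA.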


Regards,
--
U.Nakamura <usa@...>