Major bottleneck fixed! (almost)

by Steve Shreeve

jruby-dev,

I have a Ruby script that does a lot of file seeking. Under MRI, the average of 10 runs is about 0.55 secs. Under JRuby, it was hovering around 15 secs.

Tonight, headius dropped in a little patch so that the openInternal() method of RubyFile.java would use NIO. The times dropped to 1.8 secs for JRuby, effectively running more than 8 times as fast!

While there is some work yet to do on it, it represents a HUGE improvement and when fully addressed, it could dramatically improve the overall performance of JRuby.

headius' access is limited tonight, but I'm sure he'll post something when he's able to do so.

Cheers,

Steve Shreeve

==

ps - Props to the JRuby team... you guys are really doing amazing work!

[20061118@21:49] shreeve: this week, I contacted you about the Runnable() stuff and you guys pounced on it
[20061118@21:49] shreeve: then, about the profile stuff and you pounced on it
[20061118@21:49] headius: my subscrip is through my sun email and the servers are down for maint
[20061118@21:49] headius: oh heheh, that's you
[20061118@21:49] shreeve: then, about the Comparable stuff and you pounced on it
[20061118@21:49] shreeve: then, about the NIO and you pounced on it
[20061118@21:50] headius: yeah
[20061118@21:50] headius: we like to fix things

Re: Major bottleneck fixed! (almost)

by Steve Shreeve

jruby-dev,

headius just updated his post, here's the pastebin link to the NIO patch I referred to:

http://pastebin.com/827879

Cheers,

Steve Shreeve

Re: Major bottleneck fixed! (almost)

by Charles Oliver Nutter-2

Steve Shreeve wrote:
> jruby-dev,
>
> I have a Ruby script that does a lot of file seeking. Under MRI, the average
> of 10 runs is about 0.55 secs. Under JRuby, it was hovering around 15 secs.
>
> Tonight, headius dropped in a little patch so that the openInternal() method
> of RubyFile.java would use NIO. The times dropped to 1.8 secs for JRuby,
> effectively running more than 8 times as fast!

This is pretty strong motivation to migrate all our IO stuff to NIO
finally. This bottleneck could be hitting us in a lot of places.

However I'm no NIO expert...so I'm hoping someone out there will have
free cycles to start looking at this. My patch was very primitive, but
it enabled NIO for file reads at a substantial performance gain. Since
we already have a NIO IOHandler, it may not be much farther to go 100% NIO.

Anyone out there? Evan Buswell, you still with us? :)

--
Charles Oliver Nutter, JRuby Core Developer
Blogging on Ruby and Java @ headius.blogspot.com
Help spec out Ruby today! @ www.headius.com/rubyspec
headius@... -- charles.nutter@...

---------------------------------------------------------------------
To unsubscribe from this list please visit:

    http://xircles.codehaus.org/manage_email


Re: Major bottleneck fixed! (almost)

by Jools-2



Interesting.

Are we going to totally embrace NIO throughout the whole codebase now?

I'm asking because I've been working on a Marshaling patch, which is all NIO based.

--Jools



On 11/19/06, Charles Oliver Nutter <charles.nutter@...> wrote:
> Steve Shreeve wrote:
> > jruby-dev,
> >
> > I have a Ruby script that does a lot of file seeking. Under MRI, the average
> > of 10 runs is about 0.55 secs. Under JRuby, it was hovering around 15 secs.
> >
> > Tonight, headius dropped in a little patch so that the openInternal() method
> > of RubyFile.java would use NIO. The times dropped to 1.8 secs for JRuby,
> > effectively running more than 8 times as fast!
>
> This is pretty strong motivation to migrate all our IO stuff to NIO
> finally. This bottleneck could be hitting us in a lot of places.
>
> However I'm no NIO expert...so I'm hoping someone out there will have
> free cycles to start looking at this. My patch was very primitive, but
> it enabled NIO for file reads at a substantial performance gain. Since
> we already have a NIO IOHandler, it may not be much farther to go 100% NIO.
>
> Anyone out there? Evan Buswell, you still with us? :)




Re: Major bottleneck fixed! (almost)

by Steve Shreeve

I think the goal was to use NIO wherever possible, but the problem is that NIO has a difficult-to-master API and many have commented that it is difficult to get things right. If anyone out there has a mastery (or proficiency) with NIO, I'm sure there's a lot of great work you can do to help speed up JRuby. I know headius would love the help!

Steve Shreeve

Re: Major bottleneck fixed! (almost)

by Jools-2


Hi Steve,

I guess if you've used the select() call in 'C' then Java NIO is pretty straightforward :-)

But, yes. You are right, it can be a very frustrating API.

Once I've nailed this Marshaling patch down, I'll start heading back through the IO code.
One place we can make significant performance gains will be in the network-based IO; I've not even looked at that yet!

If there are any known bottlenecks I'd be happy to wade in.

--Jools

On 11/19/06, Steve Shreeve <steve.shreeve@...> wrote:

> I think the goal was to use NIO wherever possible, but the problem is that
> NIO has a difficult-to-master API and many have commented that it is
> difficult to get things right. If anyone out there has a mastery (or
> proficiency) with NIO, I'm sure there's a lot of great work you can do to
> help speed up JRuby. I know headius would love the help!
>
> Steve Shreeve

--
View this message in context: http://www.nabble.com/Major-bottleneck-fixed%21-%28almost%29-tf2662766.html#a7432486
Sent from the JRuby - Dev mailing list archive at Nabble.com.





Re: Major bottleneck fixed! (almost)

by Steve Shreeve

Jools,

The major bottleneck we found last night was in the openInternal() method of RubyFile.java. headius did the following quick hack:

    http://pastebin.com/827879

and we saw an 8.3 times speedup! The problem is the quick hack adversely impacts writes (headius mentioned it was something related to flushing of buffers???). If you were interested in starting with the openInternal() stuff and making sure that writes are not adversely affected, that would be AWESOME!

If anyone else out there would like to chime in that would be great, too! :-)

Thanks,

Steve Shreeve

Re: Major bottleneck fixed! (almost)

by Jools-2


Steve,

OK, just reviewed the code (top level). We are only using channels in the IOHandlerNIO.

We could steal an advantage by making more use of MappedByteBuffers, as in the map() call on a file channel.
Basically the file is mmap()'d into the JVM's address space, so it's not making system calls to read/modify data in the file.

Writes would also be instant as you would be modifying the file as if it were in memory.....
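For readers who have not used the map() call mentioned above, here is a minimal sketch of what reading and modifying a file through a MappedByteBuffer can look like. It is not taken from JRuby or from the patch in this thread: the class name `MappedFileSketch` and the method `replaceByte` are hypothetical, and it uses the `java.nio.file` API, which is newer than this discussion. Note that a change made through the buffer is written back by the operating system at some later point; `force()` asks for it to be written out explicitly.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not JRuby code.
final class MappedFileSketch {
    /**
     * Maps the whole file read-write, overwrites the byte at index,
     * and returns the byte that was stored there before.
     */
    static byte replaceByte(Path path, int index, byte value) throws IOException {
        try (FileChannel channel = FileChannel.open(
                path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // A single mapping cannot exceed Integer.MAX_VALUE bytes.
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
            byte previous = mapped.get(index); // plain memory access, no read() call
            mapped.put(index, value);          // modifies the mapped page in memory
            mapped.force();                    // request write-back to the storage device
            return previous;
        }
    }
}
```

The mapping stays valid after the channel is closed, so the buffer could also be returned to the caller and used for many reads and writes; the single-byte method here just keeps the sketch short.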

Mmm, I'll take a quick look at this now. Might not take too long.

--Jools



On 11/19/06, Steve Shreeve <steve.shreeve@...> wrote:

> Jools,
>
> The major bottleneck we found last night was in the openInternal() method of
> RubyFile.java. headius did the following quick hack:
>
>     http://pastebin.com/827879
>
> and we saw an 8.3 times speedup! The problem is the quick hack adversely
> impacts writes (headius mentioned it was something related to flushing of
> buffers???). If you were interested in starting with the openInternal()
> stuff and making sure that writes are not adversely affected, that would be
> AWESOME!
>
> If anyone else out there would like to chime in that would be great, too!
> :-)
>
> Thanks,
>
> Steve Shreeve




Re: Major bottleneck fixed! (almost)

by Charles Oliver Nutter-2

Steve Shreeve wrote:
> I think the goal was to use NIO wherever possible, but the problem is that
> NIO has a difficult-to-master API and many have commented that it is
> difficult to get things right. If anyone out there has a mastery (or
> proficiency) with NIO, I'm sure there's a lot of great work you can do to
> help speed up JRuby. I know headius would love the help!

FYI, headius is me :)



Re: Major bottleneck fixed! (almost)

by Charles Oliver Nutter-2

Jools wrote:

>
> Hi Steve,
>
> I guess if you've used the select() call in 'C' then Java NIO is pretty
> straightforward :-)
>
> But, yes. You are right, it can be a very frustrating API.
>
> Once I've nailed this Marshaling patch down, I'll start heading back
> through the IO code.
> One place we can make significant performance gains will be in the
> network-based IO; I've not even looked at that yet!
>
> If there are any known bottlenecks I'd be happy to wade in.

Well there's the bottleneck attached to this thread...I tried a simple
benchmark to demonstrate it:

require 'benchmark'
n = 8  # bytes per read; any small value shows the effect
f = File.open("some_large_file")
puts Benchmark.measure { while f.read(n); end }

Where n can be some small value. In all cases, we're far slower than
MRI, and in Steve's case reading 8 bytes at a time, I had times go
from 18s to 3s just by doing that simple NIO change. We always knew NIO
would be faster, but I never expected this.
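For comparison with the Ruby loop above, this is roughly what the same access pattern (many small fixed-size reads) looks like on the Java side against a FileChannel. It is only a sketch: it is not the RubyFile.java patch from the pastebin link, the names `SmallChunkReader` and `readInChunks` are made up for illustration, and it relies on the `java.nio.file` API, which is newer than this thread. It makes no claim about where the measured speedup came from.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not the patch discussed in this thread.
final class SmallChunkReader {
    /** Reads the file in chunkSize-byte reads and returns the total number of bytes seen. */
    static long readInChunks(Path path, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long total = 0;
        ByteBuffer chunk = ByteBuffer.allocate(chunkSize); // reused for every read
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            int n;
            while ((n = channel.read(chunk)) != -1) {
                total += n;
                chunk.clear(); // reset position and limit so the next read can refill it
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("chunks", ".bin");
        try {
            Files.write(path, new byte[1 << 20]); // 1 MiB of zeros as sample data
            long start = System.nanoTime();
            long total = readInChunks(path, 8);   // 8 bytes at a time, as in Steve's case
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(total + " bytes in " + elapsedMs + " ms");
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

Timing the same loop with different chunk sizes, or against a buffered stream, is an easy way to see how much of the cost is per-call overhead rather than actual disk access.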

If you're already deep into NIO stuff with marshalling, maybe the time
to switch to NIO globally is nearly upon us.



Re: Major bottleneck fixed! (almost)

by Steve Shreeve

Charlie,

Yep... I know! :-)

Steve

==

Charles Oliver Nutter-2 wrote:
> Steve Shreeve wrote:
> > I think the goal was to use NIO wherever possible, but the problem is that
> > NIO has a difficult-to-master API and many have commented that it is
> > difficult to get things right. If anyone out there has a mastery (or
> > proficiency) with NIO, I'm sure there's a lot of great work you can do to
> > help speed up JRuby. I know headius would love the help!
>
> FYI, headius is me :)


Re: Major bottleneck fixed! (almost)

by Steve Shreeve

Jools,

Phenomenal! I'm happy to help test whatever you've got.

Cheers,

Steve Shreeve

Re: Major bottleneck fixed! (almost)

by Jools-2



On 11/19/06, Charles Oliver Nutter <charles.nutter@...> wrote:
> Jools wrote:
> > Hi Steve,
> >
> > I guess if you've used the select() call in 'C' then Java NIO is pretty
> > straightforward :-)
> >
> > But, yes. You are right, it can be a very frustrating API.
> >
> > Once I've nailed this Marshaling patch down, I'll start heading back
> > through the IO code.
> > One place we can make significant performance gains will be in the
> > network-based IO; I've not even looked at that yet!
> >
> > If there are any known bottlenecks I'd be happy to wade in.
>
> Well there's the bottleneck attached to this thread...I tried a simple
> benchmark to demonstrate it:
>
> require 'benchmark'
> n = 8  # bytes per read; any small value shows the effect
> f = File.open("some_large_file")
> puts Benchmark.measure { while f.read(n); end }
>
> Where n can be some small value. In all cases, we're far slower than
> MRI, and in Steve's case reading 8 bytes at a time, I had times go
> from 18s to 3s just by doing that simple NIO change. We always knew NIO
> would be faster, but I never expected this.

The gains are very significant with nio due to the fact you are getting much closer to the metal.

The gains you see from using MemoryMapped files are mindbending. I think we will be very close to MRI times. Hey, we might even beat them ;-)

> If you're already deep into NIO stuff with marshalling, maybe the time
> to switch to NIO globally is nearly upon us.

I think it'll produce some significant gains, however not without a little pain :-)

I'd like to get this marshaling patch completed in the next couple of days, it'll give me good basis to work back through the IO code.

...nice to have a direction :-)

--Jools





Re: Major bottleneck fixed! (almost)

by Charles Oliver Nutter-2

Jools wrote:
> The gains are very significant with nio due to the fact you are getting
> much closer to the metal.
>
> The gains you see from using MemoryMapped files are mindbending. I think
> we will be very close to MRI times. Hey, we might even beat them ;-)

That's what I gathered as well. Not only is it closer to the metal, it
also allows us to implement IO in the way that Ruby wants it to work;
that eliminates a bunch of overhead we have emulating low-level POSIX IO
APIs.

> I think it'll produce some significant gains, however not without a
> little pain :-)
>
> I'd like to get this marshaling patch completed in the next couple of
> days, it'll give me good basis to work back through the IO code.

Awesome...do let us know if there's anything we can do to help. You'll
be a JRuby Hero (TM) if you can get Marshalling working and NIO
well-supported.



Re: Major bottleneck fixed! (almost)

by Jools-2



On 11/20/06, Charles Oliver Nutter <charles.nutter@...> wrote:
> Jools wrote:
> > The gains are very significant with nio due to the fact you are getting
> > much closer to the metal.
> >
> > The gains you see from using MemoryMapped files are mindbending. I think
> > we will be very close to MRI times. Hey, we might even beat them ;-)
>
> That's what I gathered as well. Not only is it closer to the metal, it
> also allows us to implement IO in the way that Ruby wants it to work;
> that eliminates a bunch of overhead we have emulating low-level POSIX IO
> APIs.

Well, by doing things with NIO you are actually using the POSIX IO functions, almost directly.

Because MRI is using the same APIs, it actually makes life easier to take the MRI codebase back to Java. Well, that's what I've found whilst trying to work backwards through the MRI code.

> > I think it'll produce some significant gains, however not without a
> > little pain :-)
> >
> > I'd like to get this marshaling patch completed in the next couple of
> > days, it'll give me good basis to work back through the IO code.
>
> Awesome...do let us know if there's anything we can do to help. You'll
> be a JRuby Hero (TM) if you can get Marshalling working and NIO
> well-supported.

I'll do what I can :-)

BTW, I've not looked at the parser code, but have you used nio there too ?

Just a thought.....

--Jools




Re: Major bottleneck fixed! (almost)

by Charles Oliver Nutter-2

Jools wrote:
> BTW, I've not looked at the parser code, but have you used nio there too ?
>
> Just a thought.....

We have not, and it's worth a look. However since it's generally done as
a single bulk read and then parsed in-memory, we may or may not get as
much of a speed boost. Perhaps we could avoid hosting the source
in-memory and NIO would allow it to be just as fast from the filesystem?



Re: Major bottleneck fixed! (almost)

by Jools-2



On 11/20/06, Charles Oliver Nutter <charles.nutter@...> wrote:
> Jools wrote:
> > BTW, I've not looked at the parser code, but have you used nio there too ?
> >
> > Just a thought.....
>
> We have not, and it's worth a look. However since it's generally done as
> a single bulk read and then parsed in-memory, we may or may not get as
> much of a speed boost. Perhaps we could avoid hosting the source
> in-memory and NIO would allow it to be just as fast from the filesystem?

I think we could use a memory-mapped file to make the whole file look like random access memory, without needing to allocate the memory from the heap.
I need to take a quick look at the source, this could be a quick win.....

--Jools



Re: Major bottleneck fixed! (almost)

by Wes Nakamura


On Mon, 20 Nov 2006, Jools wrote:

| On 11/20/06, Charles Oliver Nutter <charles.nutter@...> wrote:
| >
| > Jools wrote:
| > > BTW, I've not looked at the parser code, but have you used nio there too
| > ?
| > >
| > > Just a thought.....
| >
| > We have not, and it's worth a look. However since it's generally done as
| > a single bulk read and then parsed in-memory, we may or may not get as
| > much of a speed boost. Perhaps we could avoid hosting the source
| > in-memory and NIO would allow it to be just as fast from the filesystem?
| >
|
| I think we could use a memory-mapped file to make the whole file look like
| random access memory, without needing to allocate the memory from the
| heap.
| I need to take a quick look at the source, this could be a quick win.....

I took a quick look, but unless you treat the mapped file as bytes, you
need to use a CharsetDecoder (which I assume holds the decoded
CharBuffer in memory).  So we come back to the encoding handling
question.  Having said that, though, a few tests I did a while back
showed that the mapped file -> decoder was still faster than using a
Reader.
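To make the "mapped file -> decoder" path concrete, here is a small sketch of decoding a memory-mapped source file into characters. It is an illustration under assumptions, not the test code referred to above: the class `MappedSourceDecoder` and its `decode` method are hypothetical names, and the `java.nio.file` calls are newer than this thread. The convenience method `CharsetDecoder.decode(ByteBuffer)` returns a newly allocated CharBuffer, so the decoded text does end up in memory even though the bytes themselves are only mapped.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not JRuby's parser input code.
final class MappedSourceDecoder {
    /** Maps the file read-only and decodes its bytes with the given charset. */
    static CharBuffer decode(Path path, Charset charset) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer bytes =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            CharsetDecoder decoder = charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)       // fail on bad bytes
                    .onUnmappableCharacter(CodingErrorAction.REPORT); // rather than replace them
            // Allocates a new CharBuffer holding the whole decoded text.
            return decoder.decode(bytes);
        }
    }
}
```

The charset still has to come from somewhere, which is the encoding-handling question raised above; the sketch simply takes it as a parameter.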

Also, there are places (jarred scripts, network-stored scripts (?), etc.)
where URL.openStream is used, so you can't get a FileChannel.  Some of
these could be converted to use channels if they're local files, I
guess, but I didn't go too much further.

Wes



Re: Major bottleneck fixed! (almost)

by Jools-2



On 11/21/06, Wes Nakamura <wknaka@...> wrote:

> On Mon, 20 Nov 2006, Jools wrote:
>
> | On 11/20/06, Charles Oliver Nutter <charles.nutter@...> wrote:
> | >
> | > Jools wrote:
> | > > BTW, I've not looked at the parser code, but have you used nio there too
> | > ?
> | > >
> | > > Just a thought.....
> | >
> | > We have not, and it's worth a look. However since it's generally done as
> | > a single bulk read and then parsed in-memory, we may or may not get as
> | > much of a speed boost. Perhaps we could avoid hosting the source
> | > in-memory and NIO would allow it to be just as fast from the filesystem?
> | >
> |
> | I think we could use a memory-mapped file to make the whole file look like
> | random access memory, without needing to allocate the memory from the
> | heap.
> | I need to take a quick look at the source, this could be a quick win.....
>
> I took a quick look, but unless you treat the mapped file as bytes, you
> need to use a CharsetDecoder (which I assume holds the decoded
> CharBuffer in memory).  So we come back to the encoding handling
> question.  Having said that, though, a few tests I did a while back
> showed that the mapped file -> decoder was still faster than using a
> Reader.

This is true, however the backing buffer is still in virtual memory, so access to it will be much faster.

> Also, there are places (jarred scripts, network-stored scripts (?), etc.)
> where URL.openStream is used, so you can't get a FileChannel.  Some of
> these could be converted to use channels if they're local files, I
> guess, but I didn't go too much further.

In these situations we will have to simply use the resource as a channel. Not such a bad thing really.
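One way to read "use the resource as a channel" is to wrap the stream from URL.openStream in a channel with `Channels.newChannel`. The sketch below shows that idea only; it is not the change described in this message, and `ResourceChannelSketch` and `readAll` are hypothetical names. The result is a plain ReadableByteChannel rather than a FileChannel, so it cannot be memory-mapped, but the calling code can still work with ByteBuffers throughout.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical illustration only; not JRuby code.
final class ResourceChannelSketch {
    /** Reads everything behind the URL through a channel and returns the bytes. */
    static byte[] readAll(URL resource) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteBuffer buffer = ByteBuffer.allocate(8192); // heap buffer, so array() is available
        try (InputStream in = resource.openStream();
             ReadableByteChannel channel = Channels.newChannel(in)) {
            while (channel.read(buffer) != -1) {
                buffer.flip();                                // switch from filling to draining
                out.write(buffer.array(), 0, buffer.limit()); // copy what was just read
                buffer.clear();                               // make room for the next read
            }
        }
        return out.toByteArray();
    }
}
```

The same method works for `file:`, `jar:` and network URLs, which is the case where a FileChannel is not available.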

I've actually made the changes this evening to move over to channels/mapped byte buffers. There are still some more changes required to get the whole thing to work, but nothing too earth-shattering. However, as a result of this work I think we may need to refactor a little.

--Jools