[rvm-research] Slots and Object Barriers

by Cristian Perfumo

Hi everybody,

Looking at org.mmtk.plan.generational.Gen, I see these constants:

  public static final boolean USE_OBJECT_BARRIER_FOR_AASTORE = false; // choose between slot and object barriers
  public static final boolean USE_OBJECT_BARRIER_FOR_PUTFIELD = false; // choose between slot and object barriers


I notice that if they are set to true, the applications go faster in general. I was wondering what the tradeoff is, if any, between having them set to true and false.

Also, I'm curious about the tradeoff between USE_DISCONTIGUOUS_NURSERY true and false.

The only place where they are tested is org.mmtk.plan.generational.GenMutator, in the fastPath(...) method.

Is there any place where I can read what the difference is between using slots and object barriers?

Thank you very much.

Cristian.

------------------------------------------------------------------------------

_______________________________________________
Jikesrvm-researchers mailing list
Jikesrvm-researchers@...
https://lists.sourceforge.net/lists/listinfo/jikesrvm-researchers

Re: [rvm-research] Slots and Object Barriers

by Steve Blackburn

Hi Cristian,

On 26/06/2009, at 8:14 PM, Cristian Perfumo wrote:
> Looking at org.mmtk.plan.generational.Gen, I see these constants:
>
>   public static final boolean USE_OBJECT_BARRIER_FOR_AASTORE = false; // choose between slot and object barriers
>   public static final boolean USE_OBJECT_BARRIER_FOR_PUTFIELD = false; // choose between slot and object barriers
>
> I notice that if they are set to true, the applications go faster in general. I was wondering what the tradeoff is, if any, between having them set to true and false.
>
> [...]
>
> Is there any place where I can read what the difference is between using slots and object barriers?

There's a fairly simple tradeoff which Tony Hosking and I studied at some length in the following paper:


For a slot barrier, you remember the address of the newly created pointer (the slot). You do this every time a pointer into the nursery is stored; there is no filter for uniqueness. An object barrier, on the other hand, remembers (once only) the address of any object within which a pointer to the nursery is created. A bit is set in the header to record that the object has been remembered, and it is not remembered again until after the next nursery collection. So it is a little more efficient in what it remembers. However, it is less precise: at GC time the entire remembered object must be scanned to find any pointers into the nursery, whereas with a slot barrier only the remembered slot need be checked. You can think of an object barrier as a special case of card marking.
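
The difference between the two barriers can be illustrated with a small toy model. This is only a sketch of the idea as described above, not MMTk code: the classes `BarrierSketch`, `Obj` and `Slot`, the `remembered` flag, and all method names are hypothetical, and the nursery test is reduced to a boolean field.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of slot vs. object write barriers. Hypothetical names; not MMTk code. */
class BarrierSketch {
    /** Hypothetical heap object: nursery flag, "remembered" header bit, reference fields. */
    static final class Obj {
        final boolean inNursery;
        boolean remembered;            // header bit, used only by the object barrier
        final Obj[] fields;
        Obj(boolean inNursery, int numFields) {
            this.inNursery = inNursery;
            this.fields = new Obj[numFields];
        }
    }

    /** A remembered slot: the object written to and the index of the written field. */
    static final class Slot {
        final Obj holder;
        final int index;
        Slot(Obj holder, int index) { this.holder = holder; this.index = index; }
    }

    final List<Slot> slotRemset = new ArrayList<>();
    final List<Obj> objectRemset = new ArrayList<>();

    /** True when the store creates a pointer from outside the nursery into it. */
    private static boolean createsNurseryPointer(Obj holder, Obj target) {
        return !holder.inNursery && target != null && target.inNursery;
    }

    /** Slot barrier: remember the slot on every qualifying store, duplicates included. */
    void storeWithSlotBarrier(Obj holder, int index, Obj target) {
        holder.fields[index] = target;
        if (createsNurseryPointer(holder, target)) {
            slotRemset.add(new Slot(holder, index));
        }
    }

    /** Object barrier: remember the holder once, guarded by its header bit. */
    void storeWithObjectBarrier(Obj holder, int index, Obj target) {
        holder.fields[index] = target;
        if (createsNurseryPointer(holder, target) && !holder.remembered) {
            holder.remembered = true;
            objectRemset.add(holder);
        }
    }

    /** GC-time work for the slot remset: one check per remembered slot. */
    int slotsToCheckAtGc() { return slotRemset.size(); }

    /** GC-time work for the object remset: every field of every remembered object. */
    int fieldsToScanAtGc() {
        int total = 0;
        for (Obj o : objectRemset) total += o.fields.length;
        return total;
    }

    public static void main(String[] args) {
        BarrierSketch s = new BarrierSketch();
        Obj matureA = new Obj(false, 8);
        Obj matureB = new Obj(false, 8);
        Obj young = new Obj(true, 0);
        // Store a nursery pointer into the same field three times under each barrier.
        for (int i = 0; i < 3; i++) {
            s.storeWithSlotBarrier(matureA, 0, young);
            s.storeWithObjectBarrier(matureB, 0, young);
        }
        System.out.println("slot remset entries:   " + s.slotRemset.size());
        System.out.println("object remset entries: " + s.objectRemset.size());
        System.out.println("slots checked at GC:   " + s.slotsToCheckAtGc());
        System.out.println("fields scanned at GC:  " + s.fieldsToScanAtGc());
    }
}
```

In this toy run the slot barrier records the same slot three times but only has to check those three entries at collection time, while the object barrier records one entry and then has to scan all eight fields of the remembered object.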

> Also, I'm curious about the tradeoff between USE_DISCONTIGUOUS_NURSERY true and false.

Using discontiguous nurseries allows us to fully utilize virtual memory at a minor cost: the test for whether something is in the nursery requires a dereference rather than a comparison against an address constant. Last time I measured, the overhead was very small. However, when I enabled it on the head we noticed some new regressions, and I did not have time to track them all down. I may re-enable this option some time soon; it makes our system more flexible.
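
The cost difference between the two nursery tests can be sketched as follows. This is an illustration under stated assumptions, not the MMTk implementation: the class `NurserySketch`, the chunk size, the boundary address, and the assumption that a contiguous nursery sits at the top of a 32-bit address range are all hypothetical choices made for the example.

```java
/** Toy model of contiguous vs. discontiguous nursery tests. Hypothetical; not MMTk code. */
class NurserySketch {
    static final int LOG_BYTES_IN_CHUNK = 22;        // 4 MB chunks (illustrative value)
    static final long NURSERY_START = 0x6000_0000L;  // contiguous case: fixed boundary (assumed)
    static final byte MATURE = 0;
    static final byte NURSERY = 1;

    // Discontiguous case: one descriptor byte per chunk of a 32-bit address space.
    final byte[] chunkMap = new byte[1 << (32 - LOG_BYTES_IN_CHUNK)];

    /** Contiguous nursery: a single comparison against an address constant. */
    static boolean inContiguousNursery(long addr) {
        return addr >= NURSERY_START;
    }

    /** Discontiguous nursery: a table lookup (a dereference) keyed by chunk index. */
    boolean inDiscontiguousNursery(long addr) {
        return chunkMap[(int) (addr >>> LOG_BYTES_IN_CHUNK)] == NURSERY;
    }

    /** Hand the chunk containing addr to the nursery, wherever it lies in memory. */
    void assignChunkToNursery(long addr) {
        chunkMap[(int) (addr >>> LOG_BYTES_IN_CHUNK)] = NURSERY;
    }

    public static void main(String[] args) {
        NurserySketch heap = new NurserySketch();
        // Nursery chunks may be scattered across the address space.
        heap.assignChunkToNursery(0x1000_0000L);
        heap.assignChunkToNursery(0x7000_0000L);
        System.out.println(heap.inDiscontiguousNursery(0x1000_0040L));
        System.out.println(heap.inDiscontiguousNursery(0x2000_0000L));
        System.out.println(inContiguousNursery(0x1000_0040L));
    }
}
```

The table lookup lets any free chunk serve as nursery space, which is the flexibility mentioned above; the price is one extra memory load in a test that runs on every barrier fast path.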

> The only place where they are tested is org.mmtk.plan.generational.GenMutator, in the fastPath(...) method.

Right.   Small bits of code that are ubiquitous.   Those few lines of code critically affect the performance of the whole system :-)

Cheers,

--Steve


PS,  Since this list is run by researchers for researchers, we always appreciate it if you disclose your affiliation (only an issue if you're using an opaque email address such as gmail).  That way we can know which research groups we are conversing with.  Thanks.





Re: [rvm-research] Slots and Object Barriers

by Ian Rogers (nabble)

Hi Cristian,

The barrier options are a legacy of the "Barriers: Friend or Foe?" paper. This is the best place to read about their trade-offs, although the analysis is now quite dated and some of the barriers are shown to improve application performance compared to no barriers! The card mark barriers never made it into, or were removed from, Jikes RVM. This chimes with the general "remove anything we don't want to support" attitude of Jikes RVM, and this attitude is the raison d'être for the MRP project [1]. Card mark barriers are issue MRP-21 [2]. Discontiguous spaces are much less of an issue with 64-bit address spaces, and MRP is the only Jikes RVM based platform to provide 64-bit address spaces on Intel hardware. More issues with barriers are being exposed by our ongoing integration of the JUnicorn Java OS foundation into MRP.

Regards,
Ian

[1] http://mrp.codehaus.org/
[2] http://jira.codehaus.org/browse/MRP-21


2009/6/26 Cristian Perfumo <cperfumo@...>
[...]

Re: [rvm-research] Slots and Object Barriers

by Steve Blackburn

Hi Cristian,

For the friend or foe paper I implemented a card marking barrier.  
The other write barriers were pre-existing for one reason or another.  
We no longer use the zone barrier.  None of our collectors require it  
and its performance is particularly poor.

The code you refer to is relatively recent.   I was exploring the  
possibility that these different choices may favor arrays and scalars  
differently.   The trade offs are complex and vary from benchmark to  
benchmark.  In the end there was no clear cut winner.   I typically re-
evaluate issues like this from time to time to see if the general  
state of the VM has changed sufficiently to make the analysis  
different.   I should do so again soon since a lot has changed on the  
performance front recently, both within the VM, and with newer hardware.

Barriers can indeed sometimes improve performance compared to no  
barriers.   If the barrier is rarely taken (no or little work actually  
done) then sometimes the effect of introducing the barrier can induce  
small positive effects.   However, in the context of adaptive  
optimization these will fall out in the noise.

The paper was strictly concerned with the mutator effect of the  
barriers.   As I said in the previous email, the various barriers can  
make different trade offs between mutator and GC time overheads.   We  
did not attempt to address that tradeoff in the paper.   My  
implementation of card marking was thus never industrial strength.   I  
did make all the source available as a patch on my web page (follow  
the link I sent you), so you can take a look if you're interested in  
it.  The key problem with card marking is that it is less general than  
many of the other approaches.   It is OK for a two generational  
collector, but is not useful for a multi-space system where it is  
necessary to track more information than simply the binary existence  
(or not) of a pointer.   For those reasons we never decided to use  
card marking in our collectors.
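
The limitation described above can be seen in a minimal card table sketch. This is an illustration only, not the patch mentioned in the message: the class `CardTableSketch`, the 512-byte card size, and the method names are hypothetical.

```java
/** Toy card-marking write barrier. Hypothetical names and sizes; not Jikes RVM code. */
class CardTableSketch {
    static final int LOG_BYTES_IN_CARD = 9;   // 512-byte cards (illustrative value)
    final long heapStart;
    final byte[] cards;                       // one dirty flag per card

    CardTableSketch(long heapStart, long heapBytes) {
        this.heapStart = heapStart;
        long cardBytes = 1L << LOG_BYTES_IN_CARD;
        this.cards = new byte[(int) ((heapBytes + cardBytes - 1) >>> LOG_BYTES_IN_CARD)];
    }

    private int cardIndex(long addr) {
        return (int) ((addr - heapStart) >>> LOG_BYTES_IN_CARD);
    }

    /** Barrier: unconditionally dirty the card that covers the written slot. */
    void recordWrite(long slotAddr) {
        cards[cardIndex(slotAddr)] = 1;
    }

    /** All the table can say: was there at least one write in this card, yes or no. */
    boolean isDirty(long addr) {
        return cards[cardIndex(addr)] != 0;
    }

    /** Number of cards the collector would have to scan. */
    int dirtyCards() {
        int n = 0;
        for (byte c : cards) if (c != 0) n++;
        return n;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(0x1000L, 4096L);
        table.recordWrite(0x1008L);   // two writes land in the first card
        table.recordWrite(0x1010L);
        table.recordWrite(0x1200L);   // one write lands in the second card
        System.out.println("dirty cards: " + table.dirtyCards());
    }
}
```

Each card carries a single yes/no flag, so the table records neither which slot was written nor which space the new pointer refers to. That is enough to find old-to-young pointers in a two-generation collector, but a collector with several spaces would have to rescan every dirty card and rediscover where each pointer leads.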

I hope that helps.

Cheers,

--Steve


Re: [rvm-research] Slots and Object Barriers

by Tony Hosking

Admittedly, the work is now somewhat dated, but card mark barriers were compared extensively with other barriers for their effects on the collector in our OOPSLA'92 paper:



On 26 Jun 2009, at 07:35, Steve Blackburn wrote:

[...]

Re: [rvm-research] Slots and Object Barriers

by Steve Blackburn

On 26/06/2009, at 8:14 PM, Cristian Perfumo wrote:

> I notice that if they are set to true, the applications go faster in  
> general. I was wondering what the tradeoff is, if any, between  
> having them set to true and false.

Over the weekend I ran a pretty exhaustive performance comparison on  
two platforms.   These results suggest that overall, there's nothing  
in it (at least on average over a decent set of benchmarks...)

DaCapo, jvm98 & jbb2005 each run at 2 X minimum heap size. "AA" ->  
aastore, "PF" -> putfield

Machine 1.   Intel i7.
        Total time:
                http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/i7-barriers/time.all.html
        Mutator:
                http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/i7-barriers/time.mu.all.html
        GC:
                http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/i7-barriers/time.gc.all.html

Machine 2.   Intel Core 2 Quad.
        Total time:
                http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q-barriers/time.all.html
        Mutator:
                http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q-barriers/time.mu.all.html
        GC:
                http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q-barriers/time.gc.all.html

I also have data for a 3 X heap size, and it is almost identical.

I hope that helps your understanding of the tradeoffs.   It was a  
useful exercise for me because it re-confirms that what we have got as  
our defaults is reasonable, at least for these two architectures.

Cheers,

--Steve


Re: [rvm-research] Slots and Object Barriers

by Cristian Perfumo

Hi!

Thank you all for the information you gave me. The results Steve provides are very interesting to look at as well.
I'll carefully look into all the references you guys gave me. It looks like there is a lot of information and answers in them.

I will probably continue with this thread later on and surely open up new ones, since I'm new to Jikes.

Thanks again.

Cristian Perfumo
PhD Student - Barcelona Supercomputing Center
Barcelona, Spain
http://www.bscmsrc.eu/people/cristian-perfumo


On Mon, Jun 29, 2009 at 1:54 PM, Steve Blackburn <Steve.Blackburn@...> wrote:
[...]