[rvm-research] Looking for Sources of Performance Variation

by Jan Sinschek

This is a troubleshooting question. I am trying to run the DaCapo benchmarks with an older revision (14775) of Jikes RVM, using the "perf" test-run (which I adapted to run only the DaCapo benchmarks), but the measurements turn out to be very unstable. For example, running dacapo-fop with 9 warm-up iterations plus 1 timed iteration, for 6 executions, mostly gives me results within a narrow range (about 3% variation), but quite a number of measurements (about one fifth) are far off (more than 10% higher). I have also attempted a small baseline compiler modification to save some control flow profiling data (edge counters); when I run the patched VM with this code, all measurements are catapulted into the higher ballpark.

I have switched off AOS recompilation, which apparently also ensures that there is no invocation-threshold-based recompilation. I am running Ubuntu 7.04 in single-user mode on a 2-core Intel machine. The configuration is standard (profiled production build with classpath, DaCapo 2006/10). I have not manually started any additional system services (not even an Xvfb server to run dacapo-chart) and, as stated above, I encounter the phenomenon on the out-of-the-box RVM.

Right now, I am at a loss as to what might cause these fluctuations, so I am interested in any advice or ideas.

Christian Sinschek,
Technische Universität Darmstadt
_______________________________________________
Jikesrvm-researchers mailing list
Jikesrvm-researchers@...
https://lists.sourceforge.net/lists/listinfo/jikesrvm-researchers

Re: [rvm-research] Looking for Sources of Performance Variation

by Steve Blackburn

Hi Christian,

There are some very well known sources of performance variation for  
managed runtimes, such as Jikes RVM.   However, it sounds like you  
have accounted for these.   It is a little hard to tell though.   It  
might help if you can provide the following information:

- The *exact* command line used for a specific benchmark
- The *exact* results produced by one of your runs (ideally a log of  
the 60 results you report below)
- The *exact* hardware you are running on (you say it is dual core, do  
you mean Core 2 Duo?).

You should not see any significant variation if you turn the AOS  
off.   I do this fairly routinely.   Nonetheless, I'd normally take 10  
measurements even with the AOS off.  With the AOS on, I'd be inclined
to take 20 measurements.   I don't think you need to time the 10th  
iteration.   Take a look at the warm-up curves in the right column of  
this page (http://dacapo.anu.edu.au/regression/perf/2006-10-MR2.html)  
and you'll see that steady state is reached earlier than that.   I  
typically use the 4th iteration.

Cheers,

--steve
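
A minimal sketch of how repeated measurements like those suggested above could be summarized, assuming Python 3. The function name `summarize` and the parameter `outlier_threshold` are hypothetical names chosen for illustration; the 10% threshold mirrors the deviation mentioned in the original question, and the sample data are the six timed fop results (`-n 3`) from the results file attached later in this thread.

```python
import statistics

def summarize(times_ms, outlier_threshold=0.10):
    """Summarize timed iterations (msec) and flag values far from the median."""
    median = statistics.median(times_ms)
    mean = statistics.mean(times_ms)
    # Sample standard deviation needs at least two data points.
    stdev = statistics.stdev(times_ms) if len(times_ms) > 1 else 0.0
    # A measurement counts as an outlier if it deviates from the median
    # by more than the given fraction (hypothetical criterion).
    outliers = [t for t in times_ms
                if abs(t - median) / median > outlier_threshold]
    return {
        "n": len(times_ms),
        "mean": mean,
        "median": median,
        "cv": stdev / mean,  # coefficient of variation
        "outliers": outliers,
    }

# Timed iteration of each of the six fop executions run with -n 3.
fop_3 = [3554, 2863, 2814, 2905, 2759, 2828]
summary = summarize(fop_3)
print(summary)
# 3554 msec is flagged: it lies about 25% above the median of 2845.5 msec.
```

Using the median rather than the mean as the reference point is a design choice here: a single slow execution shifts the mean noticeably but leaves the median nearly unchanged.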

On 07/05/2009, at 10:13 PM, sunai@... wrote:

> [...]

Re: [rvm-research] Looking for Sources of Performance Variation

by Michael Hind


For other sources of variation, you might want to read Mytkowicz et al.'s ASPLOS'09 paper (http://www-plan.cs.colorado.edu/klipto/).

Mike
_____________________________________________________________
Michael Hind, Senior Manager, Programming Technologies Department
IBM T.J. Watson Research Center
http://www.research.ibm.com/people/h/hind
914 784-7589
My internal blog:  
http://blogs.tap.ibm.com/weblogs/hindsight



Steve Blackburn <Steve.Blackburn@...> wrote on 05/07/2009 08:24 AM:

> [...]

Re: [rvm-research] Looking for Sources of Performance Variation

by Jan Sinschek

Hi Steve,

I hope the attached output files are suitable; if not, I can extract the relevant data instead (I just assume you have the means at hand to quickly get the data you want). The two Results files originate from the same test-run, performed twice (the "1x10" in its name is a misnomer). Each run executed fop and xalan with 3 and with 10 iterations, for 6 executions each. I just glanced over the warm-up times, and they seem very unstable as well.

ciao,
jan
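
For readers who want to pull the numbers out of the attached output themselves, here is a minimal sketch, assuming Python 3 and the harness line format visible in the results file ("completed warmup in N msec" and "PASSED in N msec"). The name `parse_harness_output` is hypothetical, and the sample text is the first fop execution from the attachment.

```python
import re

# Matches the two kinds of timing lines printed by the harness in the
# attached output; the benchmark name is captured but not used here.
LINE = re.compile(
    r"===== DaCapo (\w+) (completed warmup|PASSED) in (\d+) msec ====="
)

def parse_harness_output(text):
    """Return (warmup_times, timed_time) in msec for one execution."""
    warmups, timed = [], None
    for match in LINE.finditer(text):
        msec = int(match.group(3))
        if match.group(2) == "PASSED":
            timed = msec
        else:
            warmups.append(msec)
    return warmups, timed

sample = """===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 4405 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3570 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 3554 msec =====
"""

warmups, timed = parse_harness_output(sample)
print(warmups, timed)  # prints: [4405, 3570] 3554
```

Keeping the warm-up times alongside the timed iteration makes it possible to check whether an execution had settled before it was measured.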

Steve Blackburn wrote:

> [...]

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 3.06GHz
stepping : 5
cpu MHz : 3056.983
cache size : 1024 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 6118.37
clflush size : 64

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 3.06GHz
stepping : 5
cpu MHz : 3056.983
cache size : 1024 KB
physical id : 3
siblings : 1
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 6114.11
clflush size : 64


cat: invalid option -- a
Try `cat --help' for more information.

<results version="1.1">
<name>dacapo_basel_legacy_1x10</name>
<variant>dacapo_basel_legacy_1x10</variant>
<revision>14775</revision>
<start-time>2009-05-06T16:14:19Z</start-time>
<end-time>2009-05-06T16:30:10Z</end-time>
<parameters>
</parameters>
<host>
<name>shostakovich</name>
<parameters>
</parameters>
</host>
<test-configuration>
<build-configuration>production</build-configuration>
<name>default</name>
<parameters>
<parameter key="mode" value="/home/jan/j14775/dis/production_ia32-linux"/>
<parameter key="mode" value="performance"/>
</parameters>
<group>
<name>perf-dacapo-fop</name>
<test>
<name>fop-3</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms174M -Xmx174M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 3 fop]]></command>
<parameters>
<parameter key='initial.heapsize' value='174'/>
<parameter key='max.heapsize' value='174'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="3554"/>
</statistics>
<exit-code>0</exit-code>
<duration>11910</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 4405 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3570 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 3554 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="2863"/>
</statistics>
<exit-code>0</exit-code>
<duration>10101</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3845 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3085 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2863 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="2814"/>
</statistics>
<exit-code>0</exit-code>
<duration>9837</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3804 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2908 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2814 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="2905"/>
</statistics>
<exit-code>0</exit-code>
<duration>9960</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3777 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2951 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2905 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="2759"/>
</statistics>
<exit-code>0</exit-code>
<duration>9713</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3764 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2882 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2759 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="2828"/>
</statistics>
<exit-code>0</exit-code>
<duration>10020</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3876 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3010 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2828 msec =====
]]></output>
</test-execution>
</test>
<test>
<name>xalan-3</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms258M -Xmx258M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 3 xalan]]></command>
<parameters>
<parameter key='initial.heapsize' value='258'/>
<parameter key='max.heapsize' value='258'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="8364"/>
</statistics>
<exit-code>0</exit-code>
<duration>29995</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11692 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8963 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8364 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="8386"/>
</statistics>
<exit-code>0</exit-code>
<duration>30113</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11714 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9047 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8386 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="8522"/>
</statistics>
<exit-code>0</exit-code>
<duration>29673</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11327 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8853 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8522 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="8336"/>
</statistics>
<exit-code>0</exit-code>
<duration>30228</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11855 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9043 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8336 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="8328"/>
</statistics>
<exit-code>0</exit-code>
<duration>30092</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11917 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8861 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8328 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="8401"/>
</statistics>
<exit-code>0</exit-code>
<duration>30193</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11680 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9140 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8401 msec =====
]]></output>
</test-execution>
</test>
<test>
<name>fop-10</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms174M -Xmx174M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 10 fop]]></command>
<parameters>
<parameter key='initial.heapsize' value='174'/>
<parameter key='max.heapsize' value='174'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="2950"/>
</statistics>
<exit-code>0</exit-code>
<duration>32436</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 4008 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3404 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3174 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3045 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3179 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2858 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2918 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2953 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3580 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2950 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="2736"/>
</statistics>
<exit-code>0</exit-code>
<duration>29711</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3758 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2949 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3166 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2833 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3081 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2695 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2675 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2755 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2712 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2736 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="2721"/>
</statistics>
<exit-code>0</exit-code>
<duration>29521</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3835 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2922 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2829 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2753 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2741 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2731 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2816 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2738 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3083 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2721 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="2663"/>
</statistics>
<exit-code>0</exit-code>
<duration>29605</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3871 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2897 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2963 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3127 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2749 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2813 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2706 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2736 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2727 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2663 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="2732"/>
</statistics>
<exit-code>0</exit-code>
<duration>29828</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3769 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2910 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2817 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3137 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2751 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2787 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2785 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2706 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3068 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2732 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="2662"/>
</statistics>
<exit-code>0</exit-code>
<duration>29751</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3791 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2908 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2900 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3210 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2927 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2768 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2777 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2757 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2705 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2662 msec =====
]]></output>
</test-execution>
</test>
<test>
<name>xalan-10</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms258M -Xmx258M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 10 xalan]]></command>
<parameters>
<parameter key='initial.heapsize' value='258'/>
<parameter key='max.heapsize' value='258'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="8249"/>
</statistics>
<exit-code>0</exit-code>
<duration>85697</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11828 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8948 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8334 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8094 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8076 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7965 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7740 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7716 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7723 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8249 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="7630"/>
</statistics>
<exit-code>0</exit-code>
<duration>85269</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11571 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8945 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8403 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8168 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7965 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7816 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7773 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7783 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8210 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7630 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="7586"/>
</statistics>
<exit-code>0</exit-code>
<duration>85369</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11819 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8955 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8823 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8058 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7896 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7776 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7813 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8072 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7546 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7586 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="7684"/>
</statistics>
<exit-code>0</exit-code>
<duration>86179</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11892 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9085 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8324 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8128 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8080 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7901 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7908 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8064 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8068 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7684 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="8356"/>
</statistics>
<exit-code>0</exit-code>
<duration>85494</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11644 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8760 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8500 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8112 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7969 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7676 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7752 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7605 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8083 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8356 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="7436"/>
</statistics>
<exit-code>0</exit-code>
<duration>85646</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11879 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9132 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8320 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8111 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8006 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7800 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7929 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8406 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7532 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7436 msec =====
]]></output>
</test-execution>
</test>
</group>
</test-configuration>
</results>
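
For anyone who wants to quantify the spread in these result files instead of eyeballing them, here is a minimal sketch in Python. It reads the `time` statistic of each test-execution and flags timed iterations that lie more than 10% above the per-test median. The function names (`timed_results`, `summarize`) and the 10% cut-off are my own assumptions, not part of the Jikes RVM test harness; the sample is a trimmed copy of the fop-3 and fop-10 timed values from the results file that follows, keeping only the elements the script needs.

```python
import statistics
import xml.etree.ElementTree as ET

# Trimmed sample: only the elements this script reads are kept.
# The time values are the fop-3 and fop-10 timed iterations from the results file.
SAMPLE = """<results>
<test-configuration><group>
<test><name>fop-3</name>
<test-execution><name>1</name><statistics><statistic key="time" value="2863"/></statistics></test-execution>
<test-execution><name>2</name><statistics><statistic key="time" value="2835"/></statistics></test-execution>
<test-execution><name>3</name><statistics><statistic key="time" value="2909"/></statistics></test-execution>
<test-execution><name>4</name><statistics><statistic key="time" value="2988"/></statistics></test-execution>
<test-execution><name>5</name><statistics><statistic key="time" value="2852"/></statistics></test-execution>
<test-execution><name>6</name><statistics><statistic key="time" value="2784"/></statistics></test-execution>
</test>
<test><name>fop-10</name>
<test-execution><name>1</name><statistics><statistic key="time" value="2663"/></statistics></test-execution>
<test-execution><name>2</name><statistics><statistic key="time" value="3077"/></statistics></test-execution>
<test-execution><name>3</name><statistics><statistic key="time" value="2669"/></statistics></test-execution>
<test-execution><name>4</name><statistics><statistic key="time" value="2655"/></statistics></test-execution>
<test-execution><name>5</name><statistics><statistic key="time" value="2718"/></statistics></test-execution>
<test-execution><name>6</name><statistics><statistic key="time" value="3022"/></statistics></test-execution>
</test>
</group></test-configuration>
</results>"""


def timed_results(xml_text):
    """Map each test name to its list of timed-iteration results (msec)."""
    root = ET.fromstring(xml_text)
    results = {}
    # iter("test") matches only <test> elements, not <test-execution>.
    for test in root.iter("test"):
        name = test.findtext("name")
        results[name] = [
            int(stat.get("value"))
            for stat in test.iter("statistic")
            if stat.get("key") == "time"
        ]
    return results


def summarize(times, threshold=0.10):
    """Return (median, relative spread, values above median * (1 + threshold))."""
    median = statistics.median(times)
    spread = (max(times) - min(times)) / median
    outliers = [t for t in times if t > median * (1 + threshold)]
    return median, spread, outliers


if __name__ == "__main__":
    for name, times in timed_results(SAMPLE).items():
        median, spread, outliers = summarize(times)
        print(f"{name}: median={median} msec, spread={spread:.1%}, outliers={outliers}")
```

The median is used as the reference rather than the mean so that the slow executions do not shift the baseline they are compared against. To run it on a complete results file, read the file contents and pass them to `timed_results` in place of `SAMPLE`.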

<results version="1.1">
<name>dacapo_basel_legacy_1x10</name>
<variant>dacapo_basel_legacy_1x10</variant>
<revision>14775</revision>
<start-time>2009-05-06T18:45:46Z</start-time>
<end-time>2009-05-06T19:01:33Z</end-time>
<parameters>
</parameters>
<host>
<name>shostakovich</name>
<parameters>
</parameters>
</host>
<test-configuration>
<build-configuration>production</build-configuration>
<name>default</name>
<parameters>
<parameter key="mode" value="/home/jan/j14775/dis/production_ia32-linux"/>
<parameter key="mode" value="performance"/>
</parameters>
<group>
<name>perf-dacapo-fop</name>
<test>
<name>fop-3</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms174M -Xmx174M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 3 fop]]></command>
<parameters>
<parameter key='initial.heapsize' value='174'/>
<parameter key='max.heapsize' value='174'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="2863"/>
</statistics>
<exit-code>0</exit-code>
<duration>9911</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3765 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2872 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2863 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="2835"/>
</statistics>
<exit-code>0</exit-code>
<duration>9901</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3800 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2937 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2835 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="2909"/>
</statistics>
<exit-code>0</exit-code>
<duration>9973</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3822 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2930 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2909 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="2988"/>
</statistics>
<exit-code>0</exit-code>
<duration>10032</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3818 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2912 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2988 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="2852"/>
</statistics>
<exit-code>0</exit-code>
<duration>9907</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3794 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2934 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2852 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="2784"/>
</statistics>
<exit-code>0</exit-code>
<duration>9728</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3741 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2892 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2784 msec =====
]]></output>
</test-execution>
</test>
<test>
<name>xalan-3</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms258M -Xmx258M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 3 xalan]]></command>
<parameters>
<parameter key='initial.heapsize' value='258'/>
<parameter key='max.heapsize' value='258'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="8479"/>
</statistics>
<exit-code>0</exit-code>
<duration>30093</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11808 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8836 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8479 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="8420"/>
</statistics>
<exit-code>0</exit-code>
<duration>30034</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11676 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8947 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8420 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="8588"/>
</statistics>
<exit-code>0</exit-code>
<duration>30159</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11624 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8973 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8588 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="8556"/>
</statistics>
<exit-code>0</exit-code>
<duration>30681</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11911 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9224 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8556 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="8315"/>
</statistics>
<exit-code>0</exit-code>
<duration>29836</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11704 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8852 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8315 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="8437"/>
</statistics>
<exit-code>0</exit-code>
<duration>30471</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11765 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9320 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8437 msec =====
]]></output>
</test-execution>
</test>
<test>
<name>fop-10</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms174M -Xmx174M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 10 fop]]></command>
<parameters>
<parameter key='initial.heapsize' value='174'/>
<parameter key='max.heapsize' value='174'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="2663"/>
</statistics>
<exit-code>0</exit-code>
<duration>31331</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3853 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2944 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2859 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3018 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2766 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2778 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3106 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3915 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3040 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2663 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="3077"/>
</statistics>
<exit-code>0</exit-code>
<duration>29493</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3823 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2920 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2916 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2875 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2682 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2737 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2712 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2715 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2679 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 3077 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="2669"/>
</statistics>
<exit-code>0</exit-code>
<duration>29506</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3868 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3015 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2910 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2786 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2876 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2779 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2781 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2739 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2736 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2669 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="2655"/>
</statistics>
<exit-code>0</exit-code>
<duration>31446</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3897 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3045 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2905 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 4118 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2780 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2711 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2862 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2802 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3318 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2655 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="2718"/>
</statistics>
<exit-code>0</exit-code>
<duration>29275</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3785 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2842 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2811 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2839 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2820 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3050 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2664 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2697 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2709 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 2718 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="3022"/>
</statistics>
<exit-code>0</exit-code>
<duration>29942</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3837 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3014 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2858 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 3121 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2839 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2750 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2749 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2672 msec =====
===== DaCapo fop starting warmup =====
===== DaCapo fop completed warmup in 2711 msec =====
===== DaCapo fop starting =====
===== DaCapo fop PASSED in 3022 msec =====
]]></output>
</test-execution>
</test>
<test>
<name>xalan-10</name>
<command><![CDATA[cd /home/jan/j14775/target/tests/dacapo_basel_legacy_1x10/production/perf-dacapo-fop && /home/jan/j14775/dis/production_ia32-linux/rvm -X:vm:errorsFatal=true -X:processors=all -Xms258M -Xmx258M -X:gc:ignoreSystemGC=true     -classpath "/home/jan/testResources/dacapo/dacapo-2006-10-MR2.jar" Harness -c MMTkCallback -n 10 xalan]]></command>
<parameters>
<parameter key='initial.heapsize' value='258'/>
<parameter key='max.heapsize' value='258'/>
<parameter key='time.limit' value='10000'/>
<parameter key='extra.args' value='-X:gc:ignoreSystemGC=true '/>
<parameter key='extra.rvm.args' value=''/>
<parameter key='processors' value='all'/>
<parameter key='max.opt.level' value=''/>
</parameters>
<test-execution><name>1</name>
<statistics>
<statistic key="time" value="7754"/>
</statistics>
<exit-code>0</exit-code>
<duration>84837</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11665 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8943 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8460 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8073 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7885 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7703 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7860 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7857 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7611 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7754 msec =====
]]></output>
</test-execution><test-execution><name>2</name>
<statistics>
<statistic key="time" value="8044"/>
</statistics>
<exit-code>0</exit-code>
<duration>85101</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11603 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8954 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8296 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8229 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8135 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7905 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7840 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7564 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7525 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 8044 msec =====
]]></output>
</test-execution><test-execution><name>3</name>
<statistics>
<statistic key="time" value="7975"/>
</statistics>
<exit-code>0</exit-code>
<duration>85591</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11745 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 9217 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8410 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8370 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7967 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7849 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7790 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7691 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7550 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7975 msec =====
]]></output>
</test-execution><test-execution><name>4</name>
<statistics>
<statistic key="time" value="7635"/>
</statistics>
<exit-code>0</exit-code>
<duration>84207</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11484 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8965 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8205 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8049 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7778 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7729 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7556 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7780 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7972 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7635 msec =====
]]></output>
</test-execution><test-execution><name>5</name>
<statistics>
<statistic key="time" value="7281"/>
</statistics>
<exit-code>0</exit-code>
<duration>84401</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11478 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8936 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8496 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8051 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8066 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7829 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7631 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7549 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8060 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7281 msec =====
]]></output>
</test-execution><test-execution><name>6</name>
<statistics>
<statistic key="time" value="7492"/>
</statistics>
<exit-code>0</exit-code>
<duration>85952</duration>
<result>SUCCESS</result>
<result-explanation></result-explanation>
<output><![CDATA[===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 11821 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8996 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8633 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8208 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7969 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7791 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7822 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 7694 msec =====
===== DaCapo xalan starting warmup =====
Normal completion.
===== DaCapo xalan completed warmup in 8457 msec =====
===== DaCapo xalan starting =====
Normal completion.
===== DaCapo xalan PASSED in 7492 msec =====
]]></output>
</test-execution>
</test>
</group>
</test-configuration>
</results>


Re: [rvm-research] Looking for Sources of Performance Variation

by Rhodes Brown

Christian (and others),

I thought it appropriate that I chime in to this discussion. As it
turns out, I'm planning to give a talk next week on variability in
JikesRVM results. While my slides are not yet complete, I've extracted
some of the primary graphs generated from DaCapo and SPEC JVM98 and
posted them at http://webhome.cs.uvic.ca/~rhodesb/research/JikesRVM_Performance.pdf

These results were gathered from JikesRVM 3.0.1 (production
configuration), running on a dual-core Pentium D with Ubuntu Linux
(2.6.24-23-server SMP). The VM heap size was adjusted to 5x the
minimum heap for each benchmark, as identified by Georges, et al. (see
below).

I should note at the outset that my primary interest is not
discovering, quantifying and/or controlling performance. Like many
others, I am working on an addition to Jikes and simply want to be
able to consistently measure the effect of my modifications.
Specifically, I would like to isolate the performance yield of
different compilation strategies from other factors such as GC,
scheduling, etc. I was relieved when I finally stumbled across the
work of Andy Georges and company
(http://doi.acm.org/10.1145/1297027.1297033), which confirmed my
suspicions that in fact the results of running a standard "production"
configuration on DaCapo & JVM98 are often quite unstable.

Indeed, as the results from my slides above indicate, many of the
benchmarks exhibit significant variability from run to run. It is
possible to identify a statistically valid mean "best" score, but this
often requires running well beyond the prescribed number of
iterations. Moreover, there is often a distinct difference between
taking an average of scores within an execution (capturing overall VM
performance) and taking the best score from all repetitions.

Probably the most important finding is that many of these benchmarks
do not "converge" or "stabilize" after some minimum number of
repetitions. In fact, it is quite clear that for some, the chance of
instability actually increases as more adaptive re-compilation is
applied.

My personal recommendation is to ignore the DaCapo & SPEC JVM98
notions of "coefficient of variation" (CoV). It is clear that the idea
works for some of the programs, and simply does not for others. If you
are experimenting with compilation strategies, then it would seem that
identifying the "best" performance requires taking an average over
many executions of many iterations (I have been using at least 10 runs
of 31 repetitions). As pointed out by Georges & co., anything less has
a strong likelihood of yielding potentially misleading results.

Regards,

Rhodes Brown
Instructor & Ph.D. Candidate in Computer Science
University of Victoria - Victoria, BC, Canada
http://www.cs.uvic.ca



Re: [rvm-research] Looking for Sources of Performance Variation

by Rhodes Brown

I have been informed that the slides I posted earlier do not display
properly in some versions of Adobe's PDF reader. I seem to have found
a fix and have reposted a new version at the same address:
http://webhome.cs.uvic.ca/~rhodesb/research/JikesRVM_Performance.pdf

For those interested in the statistical data, I collected the
following while regenerating the graphs. The results are over
repetitions 2-11, 12-21, and 22-31 of each execution (the first
repetition is ignored). The number of values sampled is in brackets
[n]. The 'a' value is the arithmetic mean, the 'm' value is the median,
and the 's' value is the standard deviation. I found lusearch prone to
crashing, so more results were gathered for it. Note, as Christian
observed, the variance for fop (and others) is >10% depending on how
you measure.

antlr:
         2-11: [100] a=3251.6, m=3184.5, s=336.136927
        12-21: [100] a=2989.7, m=2919.5, s=306.256151
        22-31: [100] a=2920.7, m=2904.5, s=274.450190
         best:  [10] a=2606.6, m=2605.5, s=30.434629

bloat:
         2-11: [100] a=8776.9, m=8705.5, s=482.546524
        12-21: [100] a=8396.5, m=8427.0, s=257.124797
        22-31: [100] a=8358.3, m=8382.5, s=257.532937
         best:  [10] a=8147.8, m=8143.0, s=213.404571

chart:
         2-11: [100] a=8935.3, m=8905.0, s=132.823365
        12-21: [100] a=8851.0, m=8847.0, s=62.234483
        22-31: [100] a=8859.2, m=8831.5, s=157.648035
         best:  [10] a=8773.3, m=8764.5, s=37.016663

eclipse:
         2-11: [100] a=44795.4, m=44741.0, s=1590.573083
        12-21: [100] a=43536.6, m=43648.0, s=1156.725204
        22-31: [100] a=43182.5, m=43319.5, s=1363.051853
         best:  [10] a=40892.5, m=40705.0, s=539.081575

fop:
         2-11: [100] a=1988.8, m=1926.0, s=222.580491
        12-21: [100] a=1829.0, m=1770.5, s=174.642636
        22-31: [100] a=1867.5, m=1761.5, s=419.558799
         best:  [10] a=1713.6, m=1716.0, s=16.714598

hsqldb:
         2-11: [100] a=2827.3, m=2664.0, s=566.212160
        12-21: [100] a=2457.0, m=2427.5, s=200.933126
        22-31: [100] a=2395.0, m=2338.5, s=330.049919
         best:  [10] a=2228.8, m=2224.5, s=37.389244

jython:
         2-11: [100] a=7245.7, m=7108.0, s=596.781056
        12-21: [100] a=6482.6, m=6459.5, s=189.876568
        22-31: [100] a=6328.3, m=6296.5, s=217.743325
         best:  [10] a=6218.3, m=6235.5, s=90.553913

luindex:
         2-11: [100] a=11123.1, m=11052.0, s=468.323222
        12-21: [100] a=10768.1, m=10705.5, s=344.060171
        22-31: [100] a=10793.2, m=10783.5, s=327.755937
         best:  [10] a=10402.5, m=10399.5, s=88.545343

lusearch:
         2-11: [120] a=4796.4, m=4626.5, s=433.733814
        12-21: [120] a=4563.7, m=4505.0, s=172.024492
        22-31: [120] a=4497.0, m=4476.0, s=167.335591
         best:  [12] a=4385.9, m=4409.5, s=92.450929

pmd:
         2-11: [100] a=5268.2, m=5267.0, s=216.365668
        12-21: [100] a=4979.9, m=4972.5, s=109.359262
        22-31: [100] a=4942.5, m=4931.5, s=121.922934
         best:  [10] a=4808.6, m=4801.0, s=68.839911

xalan:
         2-11: [100] a=6555.6, m=6417.5, s=382.274297
        12-21: [100] a=6105.0, m=6099.0, s=208.984870
        22-31: [100] a=5879.5, m=5867.5, s=139.621307
         best:  [10] a=5731.3, m=5735.5, s=72.657568

compress:
         2-11: [100] a=4492.4, m=4453.5, s=116.976746
        12-21: [100] a=4506.5, m=4475.5, s=123.351381
        22-31: [100] a=4521.9, m=4501.0, s=123.614577
         best:  [10] a=4408.5, m=4390.5, s=59.716460

jess:
         2-11: [100] a=1360.3, m=1349.0, s=48.228056
        12-21: [100] a=1314.4, m=1307.0, s=49.882216
        22-31: [100] a=1306.6, m=1309.0, s=26.932634
         best:  [10] a=1284.8, m=1285.0, s=16.949598

db:
         2-11: [100] a=7855.5, m=7850.0, s=31.798753
        12-21: [100] a=7871.3, m=7865.0, s=43.456202
        22-31: [100] a=7868.2, m=7861.5, s=44.100235
         best:  [10] a=7805.8, m=7801.5, s=17.510632

javac:
         2-11: [100] a=3755.7, m=3701.0, s=252.211086
        12-21: [100] a=3422.6, m=3427.5, s=65.167588
        22-31: [100] a=3334.0, m=3324.0, s=83.624452
         best:  [10] a=3268.0, m=3266.0, s=33.986926

mpegaudio:
         2-11: [100] a=2927.5, m=2933.0, s=45.364823
        12-21: [100] a=2918.1, m=2926.0, s=27.107856
        22-31: [100] a=2940.8, m=2950.0, s=32.512405
         best:  [10] a=2857.7, m=2865.0, s=37.425630

mtrt:
         2-11: [100] a=1122.4, m=1110.0, s=70.875478
        12-21: [100] a=1052.6, m=1050.0, s=38.076503
        22-31: [100] a=1054.0, m=1052.0, s=41.170495
         best:  [10] a=989.0,  m=996.5,  s=21.192242

jack:
         2-11: [100] a=2923.7, m=2912.0, s=62.005226
        12-21: [100] a=2847.2, m=2824.0, s=77.455152
        22-31: [100] a=2812.3, m=2808.5, s=42.403597
         best:  [10] a=2780.6, m=2776.5, s=25.915246
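
For anyone wanting to reproduce this style of summary on their own logs, the following is a minimal sketch of how the per-window and "best" rows above could be computed. It assumes each execution is available as a list of iteration times in msec, that 's' is the sample standard deviation, and that "best" means the minimum iteration time of each execution; none of these details are confirmed by the post, and the function names and timings are hypothetical.

```python
from statistics import mean, median, stdev


def window_stats(executions, first, last):
    """Pool iterations first..last (1-based, inclusive) from every execution.

    Returns (n, arithmetic mean, median, sample standard deviation).
    """
    pooled = [t for run in executions for t in run[first - 1:last]]
    return len(pooled), mean(pooled), median(pooled), stdev(pooled)


def best_stats(executions):
    """Statistics over the best (minimum) iteration time of each execution."""
    best = [min(run) for run in executions]
    return len(best), mean(best), median(best), stdev(best)


# Hypothetical timings in msec: 3 executions of 6 iterations each.
executions = [
    [2400, 2000, 1900, 1800, 1750, 1760],
    [2500, 2100, 1850, 1790, 2300, 1740],
    [2450, 1950, 1880, 1810, 1770, 1730],
]

# Iteration 1 is skipped, mirroring the "first repetition is ignored" rule.
for first, last in [(2, 4), (5, 6)]:
    n, a, m, s = window_stats(executions, first, last)
    print(f"{first}-{last}: [{n}] a={a:.1f}, m={m:.1f}, s={s:.6f}")

n, a, m, s = best_stats(executions)
print(f" best: [{n}] a={a:.1f}, m={m:.1f}, s={s:.6f}")
```

Pooling a window across executions mixes within-execution and between-execution variation, which is one reason the window rows can show a much larger 's' than the "best" row.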

Rhodes Brown
Instructor & Ph.D. Candidate in Computer Science
University of Victoria - Victoria, BC, Canada
http://www.cs.uvic.ca


Re: [rvm-research] Looking for Sources of Performance Variation

by Ian Rogers (nabble)

Just to note that some variation may also be possible across builds,
or across builds using different VMs, especially if doing parallel
boot image creation. Note the object layout in the boot image is now
configurable [1]. It's probably not worth looking at SpecJVM'98 too
much due to its now very small execution times. Patches for the
SpecJVM 2008 harness and to fix the RVM to run it are applied in the
MRP source tree [2] (which, as an added bonus, now also runs on Windows
using BaseBase configurations).

Ian

[1] http://icooolps.loria.fr/icooolps2008/Papers/ICOOOLPS2008_paper04_Rogers_Zhao_Watson_final.pdf
[2] http://mrp.codehaus.org/


Re: [rvm-research] Looking for Sources of Performance Variation

by Steve Blackburn

Hi Rhodes,

Thanks for providing your data.  I think this is an important subject  
and I'm glad the mailing list is discussing it! :-)

I took a look through your slides and read your post and have a few  
comments.  I also have some data that I gathered to measure the  
performance of our pending 3.1 release against 3.0.1.


o  I guess you know this, but to be clear: what you've seen has  
nothing to do with Jikes RVM per se, but rather, it is a property of  
modern high performance VMs.   The data below shows that production  
JVMs from Sun and IBM exhibit very similar characteristics.  You see
similar data in publications, and you'll see it in another form here
(http://dacapo.anu.edu.au/regression/perf/2006-10-MR2.html), where we
continuously track performance for DaCapo, including warm-up curves.
It is also important to acknowledge that this is not a
deficiency in the workloads or in the VMs, but rather it reflects the  
way modern applications behave on modern VMs.  Above all, as I show  
below, varying the architecture is crucial.


o  Some of the problems you allude to have been very well thrashed out  
in the literature.  In particular, there's been a lot of work on
methodology for measuring garbage collection meaningfully, perhaps  
less on methodology for JIT evaluation (and these are quite different  
problems).


o  We should be clear on terminology.  I believe standard terminology  
is that "invocation" is a JVM invocation, and "iteration" is an  
iteration of a benchmark (within a single JVM invocation).


o  I found it a bit hard to see exactly what points you are trying to  
make.  I'm guessing that you're thinking about some of the following:
        a. As a VM iterates over a benchmark, performance will typically  
improve to some asymptotic limit.
        b. Warmup curves for a single invocation do not exhibit monotonic  
improvement; performance goes up and down within a given invocation.
        c. For a given iteration of a given benchmark, performance for a  
given JVM will vary from invocation to invocation.


o   You say:

> Indeed, as the results from my slides above indicate, many of the
> benchmarks exhibit significant variability from run to run. It is
> possible to identify a statistically valid mean "best" score, but this
> often requires running well beyond the prescribed number of
> iterations

It's a bit unclear what you mean here.  It sounds a bit like  
conflation of points a) through c) above.  Further, it is important to  
understand that there is no "prescribed" number of iterations (or  
invocations).  As a researcher you need to design your experiment  
appropriately, and this means choosing such parameters sensibly, given  
your objectives, your context, and your constraints.

So I'm not quite sure what you are getting at.  In my previous post  
I'd mentioned off-hand using the 4th iteration and 20 invocations when  
measuring Jikes RVM's overall performance with the AOS turned on (the  
basis for the 4th iteration was that this is roughly the knee in the  
curve for Jikes RVM's warmup on DaCapo).   So I'm guessing part of  
what you say is in response to that.  To be sure, if I reduced the  
iteration count, then I would be further away from the asymptotically  
best performance.  On the other hand, increasing the iteration count  
would have the opposite effect.   The question is whether this matters  
or not.  The answer is entirely dependent on what it is that you want  
to show.  More importantly, you need to weigh the cost of further  
iterations against what else you could have done with your  
experimental budget (ie the "opportunity cost" of running those extra  
iterations).  You also need to consider what is  
"meaningful" (compiling the exact same program N times in a row is  
perhaps not particularly "meaningful", if it is a goal that the  
evaluation be somehow representative of real-world workloads; FWIW you
may want to think about SPECjvm2008 in this light).


o  Your choice of hardware platform can have a dramatic effect, and
will often dominate over other issues (such as asymptotic performance,  
as an example).   Some machines (such as the Pentium-D and its  
cousins ;-) are notoriously brittle.  In the case of the Pentium-D,  
famously the trace cache can sometimes lead to very counter-intuitive  
results, and more generally, the very deep pipe probably accentuates  
underlying noise.  Running your experiments on multiple machines is  
essential.  Just as an example, in the results below, for pre3.1, on  
antlr, the P4 results have a 95% CI of 15.3% while the C2Q has 3.5% (i.e.
the P4 was more than 4x noisier than the C2Q on that particular benchmark).


o  A small nit with your graphs...  You need to include the origin if  
you don't have normalized data.  Otherwise your data looks very  
exaggerated.  This is a standard gotcha :-)  Either include the zero  
point, or change the y axis so it is normalized.


o  You may find interesting the data I've been gathering to compare  
our pending 3.1 release with 3.0.1.  Since I'd just read your post, I  
went and modified our scripts so that I can produce some warm-up data  
(and ran things through 32 iterations just for this experiment!).  The  
data below is all gathered over 32 iterations of each benchmark and 20  
invocations.  I show means and 95% confidence intervals (expressed as  
a percent of the result).  I've done the measurements on an i7, a core  
2 quad, a pentium 4 and an atom.  I'd normally include a PPC machine  
too, for an entirely different ISA, but that turned out not to be  
convenient when I set off the runs yesterday.  I've included numbers  
for Sun's HotSpot and IBM's J9, each with a stack of performance flags  
turned on (server mode, etc etc).   The data takes time to generate;  
the graphs are incrementally updated with new benchmarks as new data  
becomes available.

- The ostensible reason for these measurements was to compare 3.0.1
against the pending release.  Compare p4, c2q and i7 results:
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/p4/bmtime.jikes.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q/bmtime.jikes.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/i7/bmtime.jikes.html

- Warm-up numbers for 4 different JVMs on the c2q (first bar is the  
final iteration, subsequent bars are warm-up iterations):
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q/bmtime.jikessvn-warmup.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q/bmtime.jikes301-warmup.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q/bmtime.sun-warmup.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q/bmtime.ibm-warmup.html
        - Note that Jikes RVM's warm-up profile is fairly similar to hotspot
        - See the 95% CI numbers and see that when a given iteration is
measured 20 times the result is fairly stable (even more so for later
iterations).
        - If you look at the same graphs on the P4 you'll see far more  
variation

- Take a look at the 95% CI's across JVMs:
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/p4/bmtime.all.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q/bmtime.all.html
        - Jikes' variability is similar to each of the other JVMs.  The  
choice of platform is the biggest factor.

- One way to reduce noise is to turn off the adaptive optimization  
system.  I've done that here and forced everything to be O1 compiled:
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/p4/bmtime.aos.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q/bmtime.aos.html
        - Notice how much lower the 95% CI is, particularly on the noisy P4.

- You can browse the data further if you're interested:
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/p4
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/c2q
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/i7
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/atom (data not online yet at time of writing)
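
As a rough illustration of reporting "means and 95% confidence intervals (expressed as a percent of the result)" for one iteration measured across several invocations, here is a minimal sketch. It assumes a Student-t interval on the sample mean; the post does not say which interval construction was actually used, and the function name and sample times are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Two-sided 95% Student-t critical values for 1..30 degrees of freedom.
T_975 = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
    2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
    2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
]


def ci95_percent(samples):
    """Return (mean, 95% CI half-width as a percentage of the mean)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two invocations")
    df = n - 1
    # Beyond the table, fall back to the normal approximation (about 1.96).
    t = T_975[df - 1] if df <= len(T_975) else NormalDist().inv_cdf(0.975)
    m = mean(samples)
    half_width = t * stdev(samples) / sqrt(n)
    return m, 100.0 * half_width / m


# Hypothetical times (msec) for the same iteration in 5 separate invocations.
m, pct = ci95_percent([100, 102, 98, 101, 99])
print(f"mean={m:.1f} msec, 95% CI = +/-{pct:.2f}%")
```

With 20 invocations the table lookup uses 19 degrees of freedom (t = 2.093), so the interval is only slightly wider than the normal approximation would give.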


o  There have been quite a few interesting studies of these issues.
Some particularly interesting work has come from Amer Diwan's group,  
his colleagues and (former) students.  Mike already pointed out one of  
their ASPLOS papers from this year.


o  My take home from all this:
        - There is no simple prescription.  You need to understand your  
system and your hypothesis, and carefully design the experiments to  
suit.
        - Consider the opportunity cost when making a decision: you don't  
have infinite resources, so if something offers diminishing returns  
you need to think very carefully whether your resources would be  
better spent running some other experiment, evaluating a new  
benchmark, etc.   I would never normally run 32 iterations.  I just  
did that this time out of interest.  :-)
        - Architecture (in fact the entire environment) really matters.    
Results from just one machine can be very misleading.
       

Thanks again for raising those interesting issues and sharing your  
slides with us.   The data is really interesting, and this is an  
important discussion to have.

Cheers,

--Steve



Re: [rvm-research] Looking for Sources of Performance Variation

by Rhodes Brown

Hello again all,

I wanted to complete my talk (to the IFIP Software Implementation
Technology Working Group) and get some feedback before following up on
Steve's reply to my original post. For those who are interested, I
have posted my slides with notes at:
http://webhome.cs.uvic.ca/~rhodesb/research/JikesRVM_Performance-Notes.pdf

As Steve noted, the issues I'm raising aren't necessarily specific to
Jikes. Cliff Click confirmed that he'd observed similar behaviors
working with the HotSpot VM. However, the issue of measuring
performance under adaptive optimization is clearly of particular
importance to Jikes researchers--especially those of us whose work
doesn't afford the luxury of being able to turn off the AOS.

Steve's point about normalizing data is well-taken. In my posted
slides, I have included a second axis that shows iteration times
normalized against the best overall time observed for each benchmark.
I've also tried to switch from "repetition" to "iteration" of a
benchmark, to be more in line with the common terminology. However,
I've stuck with "execution" over "invocation" of a VM, since my own
work deals with method invocations and I don't want to confuse the
two.

That said, if we are on the topic of presentation clarity, I'd like to
raise a couple of questions of my own.

First is the use of the geometric mean ("geomean") as an aggregate
measure of performance. I see this all over the place in papers
reporting on Jikes performance, but I have not been able to find a
single one that justifies the use of this mean. John makes a fairly
cogent argument that performance results, in particular speedup
results, should not be summarized with the geomean [1]. Depending on
one's emphasis, a weighted arithmetic or harmonic mean is more
appropriate. It would seem that the geomean is only (arguably)
appropriate in cases where the results exhibit a log-normal
distribution *and* are representative of real workloads--both
debatable points when it comes to the commonly used Java benchmarks.
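
A small worked example may make the concern concrete. The sketch below uses two hypothetical benchmarks (the times are invented for illustration, not measured) and compares the arithmetic, geometric and harmonic means of the per-benchmark speedups with the speedup in total running time. It is only one way of looking at the question, not a restatement of the argument in [1].

```python
from statistics import geometric_mean, harmonic_mean, mean

# Hypothetical running times in seconds for two benchmarks.
base = [10.0, 100.0]   # baseline configuration
new = [5.0, 200.0]     # modified configuration

# Per-benchmark speedup: 2.0 (twice as fast) and 0.5 (twice as slow).
speedups = [b / n for b, n in zip(base, new)]

arith = mean(speedups)            # 1.25
geo = geometric_mean(speedups)    # about 1.0
harm = harmonic_mean(speedups)    # 0.8

# Speedup in total time if each benchmark is run once: 110 / 205.
total = sum(base) / sum(new)

# Harmonic mean of the speedups, weighted by baseline time. Algebraically
# this equals sum(base) / sum(new), i.e. the total-time speedup.
weighted_harm = sum(base) / sum(b / s for b, s in zip(base, speedups))

print(f"arithmetic={arith:.3f} geometric={geo:.3f} harmonic={harm:.3f}")
print(f"total-time speedup={total:.3f} weighted harmonic={weighted_harm:.3f}")
```

Here the geomean reports "no change" while the modified configuration takes almost twice as long overall; only the time-weighted mean tracks total running time, and whether total running time is the right target depends on how representative the workload mix is.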

Second, and this is at the core of the point I was trying to make,
what is "bmtime"? Is this total running time for some number of
iterations? The time from a particular iteration, say the last? An
average (mean or median) of iterations within an execution? Does it
include JIT compilation, or is such a question even meaningful?

To be clear, let me try to re-state some of the points I was trying to
make earlier.

My primary intention was to debunk the myth of convergence. Some
benchmarks do, after executing a reasonable number of iterations,
approach a "typical" performance pattern with a CoV less than 0.02.
But some simply do not, regardless of how many executions or
iterations are run (antlr and hsqldb are examples). Some do converge
for some executions, but not others. Some stabilize, but not to the
same performance level. Moreover, many benchmarks actually begin to
de-stabilize when run longer with more time for adaptive optimization.
Thus, while the method suggested by Georges et al. does provide an
appropriate level of rigor, it will not always work and should not be
entirely relied upon. And certainly the rudimentary notions of
convergence built into DaCapo and SPECjvm98 should not be relied upon.
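
To illustrate what a CoV-based convergence test looks like, and how a run can pass it and later drift away again, here is a minimal sketch. It is not the DaCapo or SPECjvm98 harness logic; the window size of 3 is a hypothetical choice, and the 0.02 threshold is taken from the figure mentioned above. The times are the ten iterations of xalan execution 5 from the results log posted earlier in this thread.

```python
from statistics import mean, stdev


def cov(times):
    """Coefficient of variation: sample standard deviation over the mean."""
    return stdev(times) / mean(times)


def first_converged_iteration(times, window=3, threshold=0.02):
    """Return the 1-based iteration that ends the first window whose CoV
    is at or below the threshold, or None if no window qualifies."""
    for end in range(window, len(times) + 1):
        if cov(times[end - window:end]) <= threshold:
            return end
    return None


# xalan, execution 5 (msec): nine warmup iterations plus the timed one.
xalan_run_5 = [11478, 8936, 8496, 8051, 8066, 7829, 7631, 7549, 8060, 7281]

print("first converged at iteration:", first_converged_iteration(xalan_run_5))
print(f"CoV of the last three iterations: {cov(xalan_run_5[-3:]):.4f}")
```

For this run the test is first satisfied at iteration 6, yet the CoV of the final three iterations is above 0.05, so a harness that stopped at the first passing window would report a value that the following iterations do not support.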

My second point was to emphasize that there is an important
distinction between measuring "typical" performance over a range of
iterations from an execution (as done by Georges, et al), and
measuring the best performance potential of a particular VM
configuration. The former is appropriate when comparing modifications
that may affect several sequential iterations, as is the case for most
GC strategies. However, identifying the effectiveness of a compilation
strategy is clearly a case of the latter. In this case, we are interested in
identifying the maximum potential of the generated code while
discounting other factors. Of course, to be statistically valid, one
must find a mean-best result, not simply take the best overall value.
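
One possible reading of "mean-best" is sketched below: take the best iteration of each execution, then report the mean of those bests with a confidence interval, rather than the single best value seen anywhere. The data, the function name, and the choice of a Student t interval are assumptions made for illustration; the t value 3.182 is the two-sided 95% critical value for 3 degrees of freedom (4 executions).

```python
from math import sqrt
from statistics import mean, stdev

def mean_best(executions, t_critical):
    """Mean of per-execution best times and the CI half-width around it."""
    bests = [min(iteration_times) for iteration_times in executions]
    half_width = t_critical * stdev(bests) / sqrt(len(bests))
    return mean(bests), half_width

# Hypothetical iteration times (ms); one inner list per execution.
executions = [
    [820.0, 640.0, 600.0, 615.0],
    [810.0, 655.0, 625.0, 610.0],
    [835.0, 590.0, 630.0, 605.0],
    [825.0, 660.0, 620.0, 640.0],
]
T_95_DF3 = 3.182  # two-sided 95% Student t critical value, 3 degrees of freedom

best_mean, half_width = mean_best(executions, T_95_DF3)
overall_best = min(min(times) for times in executions)
print(f"mean-best: {best_mean:.1f} +/- {half_width:.1f} ms")
print(f"overall best: {overall_best:.1f} ms")
```

The overall best is a single sample and carries no error estimate, whereas the mean-best comes with an interval that can be compared across VM configurations.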

I would concur with a sentiment seen in several papers on performance
analysis, and echoed by Steve above:
"There is no simple prescription. You need to understand your system
and your hypothesis, and carefully design the experiments to suit."
Indeed, but I would go further: When publishing performance results,
one must choose an approach that is properly aligned with the subject
of one's study (e.g. start-up, long-run GC, long-run adaptive
optimization, etc.) *and* present an argument for why the approach is
appropriate. This latter part seems absent from most papers on Jikes
performance that I have read.

As a final point, I think it is worth noting that we (as a community)
have made an inappropriate simplification in treating performance as a
"random" variable. While there are many complicated factors that can
influence performance from platform to platform, and run to run, the
results are still effectively determined by the VM and system
configuration. The true source of randomness is timing. Small and
unpredictable external pressures ultimately lead to executions that
unfold in a bounded, but chaotic fashion. In devising measurement
schemes, we ought to be conscious of this effect and aim to extract
results in a way that is, as much as possible, oblivious to timing
variations. Thus, I would reject methods that report results from a
specific iteration, or aggregates over a fixed interval of time or
iterations.

--
Rhodes H. F. Brown

Instructor & Ph.D. Candidate in Computer Science
University of Victoria - Victoria, BC, Canada
http://www.cs.uvic.ca

References:
[1] L. K. John. Performance Evaluation and Benchmarking, chapter 4:
Aggregating Performance Metrics Over a Benchmark Suite. CRC Press.
2005.


Re: [rvm-research] Looking for Sources of Performance Variation

by Eliot Moss

In defense of certain uses of the geo mean ...

A key property of the geo mean is that the geo mean of a
collection of ratios is equal to the geo mean of their
numerators divided by the geo mean of their denominators.
That is, the geo mean of the ratios of pairs of numbers
equals the ratio of the geo means. This suggests that
it is good for dealing with collections of ratios.

Thus:

If I run a benchmark suite of (say) 20 benchmarks, summarizing
the overall performance of the suite using the geo mean of
the individual performance times is sensible ... when desiring
to compare against running the same suite under some different
condition (say with a new compiler optimization, or on a
different hardware platform).

If one takes the geo mean of the times under the "new" treatment
and divides that by the geo mean of the times under the
"old" treatment, you get (I claim) a sensible summary of the
*ratios* of the performance of the benchmarks for each
treatment.

A key thing here is that one benchmark may run a lot longer than
another one -- but if I take the ratio of the geo means, that
does not matter, since what I am really computing is the
performance difference as a ratio (i.e., "new" / "old").
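
The property can be checked numerically. The sketch below uses made-up times for three benchmarks of very different lengths (all values and names are illustrative assumptions). It confirms that the geo mean of the per-benchmark ratios equals the ratio of the geo means, and that rescaling one benchmark's times leaves that ratio unchanged, whereas a ratio of arithmetic means is pulled toward the longest-running benchmark.

```python
from math import exp, isclose, log

def geomean(values):
    """Geometric mean, computed via logs to avoid overflow on long lists."""
    return exp(sum(log(v) for v in values) / len(values))

# Hypothetical running times (s) for three benchmarks under each treatment.
old = [2.0, 40.0, 900.0]
new = [1.8, 42.0, 720.0]

ratios = [n / o for n, o in zip(new, old)]
ratio_of_geomeans = geomean(new) / geomean(old)
print(geomean(ratios), ratio_of_geomeans)  # the two values agree

# Make the first benchmark run 100x longer under both treatments.
old_scaled = [old[0] * 100] + old[1:]
new_scaled = [new[0] * 100] + new[1:]
print(geomean(new_scaled) / geomean(old_scaled))  # same ratio as before

# The ratio of arithmetic means shifts with the rescaling.
print(sum(new) / sum(old), sum(new_scaled) / sum(old_scaled))
```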

The geo mean also tends to prevent an outlier from dominating
the measurement of central tendency. I suppose you can like
that or dislike it, but it is true.

Beyond this, I have found the use of geo mean (on the one
hand) and arithmetic or harmonic mean (on the other hand)
to be an issue argued with religious fervor -- much passion,
not much conversion. I still stand by it as a sensible way to
summarize a benchmark suite's performance, especially for
comparing against other runs of the same suite using ratios.

Best wishes -- Eliot Moss


Re: [rvm-research] Looking for Sources of Performance Variation

by Steve Blackburn

On 16/05/2009, at 10:25 AM, Rhodes Brown wrote:

> My primary intention was to debunk the myth of convergence.


I guess I'm not quite sure what the "myth of convergence" is.   I  
think many, if not most, people are aware that performance of a JVM  
does not always converge to some tightly bounded point within any  
single invocation.   More broadly, the idea of chaotic behavior is  
fairly well established.  Eliot Moss has been describing JVM  
performance in exactly those terms ("chaotic behavior") since I was  
doing a postdoc about 10 years ago.  We took this pretty seriously in  
the context of GC research, because we observed that small  
perturbations in mutator behavior often manifest as huge swings in GC  
performance.   Amer Diwan's group has also looked at this a lot and  
have gone further to note chaotic behavior at the hardware level [1].

I made this point in my previous email:

> b. Warmup curves for a single invocation do not exhibit monotonic
> improvement; performance goes up and down within a given invocation.

I think most people reading this list would agree that observation is  
pretty unremarkable.   This is one of the reasons why we take means  
across a significant number of invocations.  I have not used the per-
invocation convergence tools provided by harnesses such as SPEC and  
DaCapo since that approach is not meaningful in the context of what I  
normally measure (though I assume they are useful to some people).  So  
perhaps there's some debunking to be done surrounding the use of such  
tools.  I don't know.   If that's what you're thinking, then I  
recommend you come at it with a working alternative in hand.

Let's look at one of the alternative approaches: taking the time for
a given iteration and then averaging that over multiple invocations.    
This is the approach I used in the data I pointed to in the last  
post.  The opaque name "bmtime" just referred to the time the  
benchmark reported on its final (32nd) iteration.  The other pages  
showed warmup data; times for each iteration.

In the case where replay compilation is used (as in our GC work), this  
is fairly straightforward.   For an AOS, there are at least two  
questions: a) which iteration/s to time, and b) whether or not one can  
assume that the average (cross invocation) performance curve is  
monotonic.   If the answer to b) is yes, then it is fairly easy to  
decide what to do (depending on what you're measuring).   If the  
answer to b) is no, then there are at least two conclusions: one  
should try to understand why the systemic perturbations arise, and if  
one cannot remove them, one should mitigate them
analytically.  Garbage collection is one such source of systemic  
perturbation, which is one of the reasons why we advocate measuring  
multiple heap sizes.
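
As a rough illustration of question b), the sketch below averages each iteration across invocations and flags any iteration whose mean time rises above the previous iteration's mean by more than the 95% CI half-width of that point. The data, the function names, and the flagging rule are assumptions made for illustration; 4.303 is the two-sided 95% Student t critical value for 2 degrees of freedom (3 invocations). The last entry of the curve corresponds to a "bmtime"-style number, i.e. the final iteration averaged across invocations.

```python
from math import sqrt
from statistics import mean, stdev

def warmup_curve(invocations, t_critical):
    """Per-iteration (mean, CI half-width) pairs computed across invocations."""
    curve = []
    for times in zip(*invocations):  # regroup the times by iteration index
        half_width = t_critical * stdev(times) / sqrt(len(times))
        curve.append((mean(times), half_width))
    return curve

def non_monotonic_steps(curve):
    """Iterations whose mean exceeds the previous mean by more than their CI."""
    return [i for i in range(1, len(curve))
            if curve[i][0] - curve[i - 1][0] > curve[i][1]]

# Hypothetical iteration times (ms); one inner list per invocation.
invocations = [
    [900.0, 700.0, 650.0, 720.0, 640.0],
    [910.0, 705.0, 655.0, 725.0, 642.0],
    [905.0, 702.0, 652.0, 722.0, 641.0],
]
T_95_DF2 = 4.303  # two-sided 95% Student t critical value, 2 degrees of freedom

curve = warmup_curve(invocations, T_95_DF2)
print("final-iteration mean:", curve[-1][0])
print("non-monotonic iterations:", non_monotonic_steps(curve))
```

A flagged iteration is a candidate systemic perturbation (a GC at a fixed point, for example) rather than noise, since it recurs across invocations.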

Clearly, if one presents results that average particular iterations
across invocations, one must choose the iterations carefully.   You
may have noticed that in the jikes rvm 6-hourly performance  
regressions [2], we report 1st, 3rd and 10th iterations, as well as  
numbers for both generous and "tight" heaps.   Now this is not deeply  
principled, but it does give you _some_ insight into the startup  
overhead, the steady state performance, the rate of convergence, and  
the effect of heap pressure.  On the dacapo web site [3] we show the  
same three results and additionally the warmup curves.   I decided to  
include the warmup curves because I believed in point b) above and  
think it is important to see how each of the systems is warming up.

So, if steady state and warmup are important to what it is that you're  
measuring, it is pretty clear that you should explicitly measure (and  
plot) that.  Which is why we do.  And it is clear from those curves  
that some benchmarks and some VMs have particularly chaotic behavior.

Finally, just as I've struggled a bit to nail down what problem it is  
you're pointing to, I'm not quite sure what it is you're proposing by  
way of a solution.


A few other minor notes:

o You might find this data interesting.  Here I measure 49 iterations  
(I only plot every 5th for the sake of space, but can add them all in  
if anyone really wants it), on both hotspot and jikes rvm.   One thing  
to note is that this data is pretty smooth and I don't think there are  
any examples of iteration-to-iteration variance that go outside the  
expected monotonic convergence curve by an amount that is greater than  
the 95% CI of the measured point.   If there are, then that's quite  
interesting (I didn't spot any).

        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/i7wu/bmtime.jikes-warmup.html
        http://cs.anu.edu.au/~Steve.Blackburn/private/results/jikesrvm-performance-2009/i7wu/bmtime.sun-warmup.html

o  In deference to the differing opinions on how to aggregate data, I  
generally include both arithmetic and geometric mean when I publish.  
I also include min and max results---these often get lost and are  
sometimes the most important information.  I've recently started  
taking this further and including (on my web page) all of the raw and  
tabulated data so that other researchers can scrutinize it at will.

o  You make the comment that it is debatable whether benchmarks  
(including dacapo) are representative of real world workloads.  Well  
yes.  No suite can be perfect.  However, the dacapo suite explicitly  
_tries_ to do this by trying to use unmodified source of widely used
java programs.  Since dacapo is open source, the onus is on you and  
other researchers to provide concrete feedback, better yet, to propose  
better workloads and contribute source.   It is only through  
contributions such as this that the workload stays live and lives up  
to its objective of reflecting real world workloads.   Right now we're  
preparing for a new release.   Aside from contributing new workloads,  
you can help the Jikes RVM research community enormously by  
downloading the source from svn and getting batik, fop, sunflow and
tomcat working (these are all broken on Jikes RVM [3] but are
expected to be in the next release).   We plan to drop antlr, bloat,
chart and hsqldb.

o  I think all of this requires some perspective.  My belief (of  
course I have no data to support it :-) is that if I were to sample  
publications from top-tier venues and critique their findings, many  
may have sub-standard analysis, but I suspect only a few fall to the  
point that their findings are actually false (due to failure of their  
analysis).  However, time and time again, I find results that I  
suspect (and sometimes have gone and verified) are indeed false due to  
more basic methodological failings, such as use of just a single  
hardware platform that happens to significantly bias the result, or  
running at just a single heap size, etc.

o  When researchers publish their source, they allow other researchers  
to confirm their findings.   I would like it very much if members of  
the Jikes RVM community would routinely publish their source along  
with their publication.   Better yet, if the findings are interesting,  
please contribute your outcome to the project.

o  I want to re-iterate the point about opportunity cost, which I  
think is the key to this discussion.  As a researcher, you have a  
finite experimental budget.   You need to choose whether to spend that
budget on evaluating different heap sizes, more iterations, more  
configurations, etc.  Whatever anyone may wish to say about  
experimental design and methodology, if the approach is not explicitly  
acknowledging opportunity cost, then it is not grounded in reality so  
I'm inclined to read it with a dose of skepticism.

o  Finally, these discussions are healthy.   However, such discussions  
are always better if they're backed by concrete, constructive  
outcomes.  Specifically, I always like to hear concrete suggestions on  
how to resolve concrete problems: "A lot of researchers need to  
measure X.  It seems that the approach is biased or flawed because of  
A, B & C.  Here's a concrete approach for measuring X which addresses  
those shortcomings."  Or, contribute new, more realistic workloads to  
the benchmark suite.   Or, contribute harnesses which help suites such  
as dacapo generate more meaningful results.  In a nutshell: Existing  
methodology is imperfect; we all find it easy to identify flaws.  
Having identified some flaws, we each need to ask what constructive
thing we can do about it.

Thanks again for contributing your thoughts to this mailing list.   I  
think the discussion will help us all.  It has helped me.

--Steve

PS, in the course of looking at what you had to say, I noticed that  
Jikes RVM was warming up slowly (something Dave had noticed two years  
ago).   So I adjusted the sample rate which gave the whole VM a nice  
little performance kick, just in time for the upcoming 3.1 release :-)  
[2]

[1] http://www.cs.colorado.edu/department/publications/reports/docs/CU-CS-1031-07.pdf
[2] http://jikesrvm.anu.edu.au/cattrack/results/habanero.anu.edu.au/perf/9230/performance_report
[3] http://dacapo.anu.edu.au/regression/perf/head.html



Re: [rvm-research] Looking for Sources of Performance Variation

by Ian Rogers (nabble)

2009/5/16 Steve Blackburn <Steve.Blackburn@...>:

>...
> o  Finally, these discussions are healthy.   However, such discussions
> are always better if they're backed by concrete, constructive
> outcomes.  Specifically, I always like to hear concrete suggestions on
> how to resolve concrete problems: "A lot of researchers need to
> measure X.  It seems that the approach is biased or flawed because of
> A, B & C.  Here's a concrete approach for measuring X which addresses
> those shortcomings."  Or, contribute new, more realistic workloads to
> the benchmark suite.   Or, contribute harnesses which help suites such
> as dacapo generate more meaningful results.  In a nutshell: Existing
> methodology is imperfect; we all find it easy to identify flaws.
> Having identified some flaws, we each need to ask what constructive
> thing can we do about it.

One thing that is currently "known" in industry but not tested in
Jikes RVM is that "real VMs" must deal with large cold methods - the
main example being JSP code. Given this is a performance pathology for
Jikes RVM (it must always baseline compile, carry round GC maps, etc.)
it's quite telling that it has not had to care about this, because no major
benchmark suite contains such methods. Another example is DaCapo not
touching any Java 5 features and thereby not justifying any
optimizations on generics. SPECjvm2008 does a better job of using Java
5 features, but the JSP problem remains open for benchmark
contributions.

Regards,
Ian
--
MRP == More Research Please, run SPECjvm2008 on a Jikes RVM based source base
