Re: [rvm-research] Looking for Sources of Performance Variation

by Rhodes Brown

Christian (and others),

I thought it appropriate to chime in on this discussion. As it
turns out, I'm planning to give a talk next week on variability in
JikesRVM results. While my slides are not yet complete, I've extracted
some of the primary graphs generated from DaCapo and SPEC JVM98 and
posted them at http://webhome.cs.uvic.ca/~rhodesb/research/JikesRVM_Performance.pdf

These results were gathered from JikesRVM 3.0.1 (production
configuration), running on a dual-core Pentium D with Ubuntu Linux
(2.6.24-23-server SMP). The VM heap size was set to 5x the minimum
heap for each benchmark, as identified by Georges et al. (see
below).

I should note at the outset that my primary interest is not
discovering, quantifying and/or controlling performance variability.
Like many others, I am working on an addition to Jikes and simply want
to be able to consistently measure the effect of my modifications.
Specifically, I would like to isolate the performance yield of
different compilation strategies from other factors such as GC,
scheduling, etc. I was relieved when I finally stumbled across the
work of Andy Georges and company
(http://doi.acm.org/10.1145/1297027.1297033), which confirmed my
suspicions that in fact the results of running a standard "production"
configuration on DaCapo & JVM98 are often quite unstable.

Indeed, as the results from my slides above indicate, many of the
benchmarks exhibit significant variability from run to run. It is
possible to identify a statistically valid mean "best" score, but this
often requires running well beyond the prescribed number of
iterations. Moreover, there is often a distinct difference between
taking an average of scores within an execution (capturing overall VM
performance) and taking the best score from all repetitions.
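To make the distinction concrete, here is a minimal sketch (in Python, purely for illustration) of the two summaries computed from the same data. The timings are made up, and the function and key names are hypothetical, not part of any benchmark harness.

```python
from statistics import mean

def summarize(runs):
    """Summarize timings; `runs` is a list of executions, each a list of
    per-iteration times in milliseconds (lower is better)."""
    per_run_mean = [mean(r) for r in runs]   # overall VM performance per execution
    per_run_best = [min(r) for r in runs]    # best iteration per execution
    return {
        "mean_of_run_means": mean(per_run_mean),
        "mean_of_run_bests": mean(per_run_best),
        "best_overall": min(per_run_best),
    }

# Hypothetical timings (ms), invented for illustration only.
runs = [
    [1200, 900, 850, 860],
    [1180, 910, 1050, 840],
    [1210, 880, 870, 990],
]

for name, value in summarize(runs).items():
    print(f"{name}: {value:.1f} ms")
```

With data like this, the single best score sits well below the per-execution averages, so which summary is reported changes the conclusion.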

Probably the most important finding is that many of these benchmarks
do not "converge" or "stabilize" after some minimum number of
repetitions. In fact, it is quite clear that for some, the chance of
instability actually increases as more adaptive re-compilation is
applied.
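One way to check this on your own data is to test whether a run, once it first looks stable, actually stays stable. The sketch below uses a sliding-window coefficient of variation; the window of 3 iterations and the 3% threshold are illustrative assumptions, and the function names and timings are hypothetical.

```python
from statistics import mean, stdev

def cov(samples):
    """Coefficient of variation: sample standard deviation over the mean."""
    return stdev(samples) / mean(samples)

def first_stable_iteration(times, window=3, threshold=0.03):
    """Return the 1-based iteration at which the last `window` timings first
    have CoV <= threshold, or None if that never happens."""
    for end in range(window, len(times) + 1):
        if cov(times[end - window:end]) <= threshold:
            return end
    return None

def stays_stable(times, window=3, threshold=0.03):
    """True only if every window after the first stable one is also stable."""
    start = first_stable_iteration(times, window, threshold)
    if start is None:
        return False
    return all(
        cov(times[end - window:end]) <= threshold
        for end in range(start, len(times) + 1)
    )

# Hypothetical per-iteration timings (ms) with a late spike, for illustration.
times = [1000, 700, 610, 600, 605, 780, 600, 602]
print(first_stable_iteration(times))
print(stays_stable(times))
```

A run like this one would satisfy a "stop when the window is stable" rule at an early iteration even though a later iteration is far off, which is the kind of behaviour described above.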

My personal recommendation is to ignore the DaCapo & SPEC JVM98
notions of "coefficient of variation" (CoV). It is clear that the idea
works for some of the programs, and simply does not for others. If you
are experimenting with compilation strategies, then it would seem that
identifying the "best" performance requires taking an average over
many executions of many iterations (I have been using at least 10 runs
of 31 repetitions). As pointed out by Georges & co., anything less has
a strong likelihood of yielding misleading results.
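As a rough sketch of how such a comparison might be scored, the Python below reduces each execution to one number (for example, the mean of its 31 repetitions) and computes a 95% confidence interval across 10 executions using Student's t. The per-run scores are invented, the function names are hypothetical, and the critical value 2.262 applies only to 10 runs (9 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 9 degrees of freedom (10 runs).
T_975_DF9 = 2.262

def confidence_interval(run_scores, t_crit):
    """Return (low, high) bounds of the confidence interval for the mean of
    `run_scores`, where each score summarizes one whole execution."""
    n = len(run_scores)
    m = mean(run_scores)
    half_width = t_crit * stdev(run_scores) / sqrt(n)
    return (m - half_width, m + half_width)

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical per-run mean times (ms), one value per execution.
baseline = [850, 862, 845, 870, 858, 849, 866, 853, 860, 847]
modified = [842, 855, 848, 861, 839, 852, 846, 858, 844, 850]

ci_base = confidence_interval(baseline, T_975_DF9)
ci_mod = confidence_interval(modified, T_975_DF9)
print("baseline CI:", ci_base)
print("modified CI:", ci_mod)
print("overlap:", intervals_overlap(ci_base, ci_mod))
```

If the two intervals overlap, the measured difference between the configurations cannot be separated from run-to-run noise at this confidence level, and more executions are probably needed.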

Regards,

Rhodes Brown
Instructor & Ph.D. Candidate in Computer Science
University of Victoria - Victoria, BC, Canada
http://www.cs.uvic.ca

> sunai@... wrote:
>
> This is a troubleshooting question. I am trying to run the dacapo
> benchmarks with an older revision (14775) of Jikes, using the
> ''perf'' test-run (I adapted it to run only the dacapo benchmarks),
> but the measurements turn out to be very unstable. E.g., for
> dacapo-fop, running the 9 warm-up + 1 timed iterations for 6
> executions mostly gives me results within a limited range (about 3%
> variation), but quite a number of measurements (about one fifth) are
> very far off (more than +10%). I have attempted a small baseline
> compiler modification to save some control flow profiling (edge
> counters); when I run the patched VM with this code, all measurements
> are catapulted into the higher ballpark.
>
> I have switched off the AOS recompilation, which apparently also
> ensures that there is no invocation threshold-based recompilation. I
> am running Ubuntu 7.04 in single user mode on a 2-core Intel
> machine. The configuration is standard (profiled production build
> with classpath, dacapo 2006/10). I have not manually started
> additional system services (not even an Xvfb server to run dacapo-
> chart) and, as stated above, encounter the phenomenon on the out-of-
> the-box RVM.
>
> Right now, I am at a loss as to what might cause these fluctuations,
> so I am interested in any advice/ideas.
>
> Christian Sinschek,
> Technische Universität Darmstadt

_______________________________________________
Jikesrvm-researchers mailing list
Jikesrvm-researchers@...
https://lists.sourceforge.net/lists/listinfo/jikesrvm-researchers