Analysis on make parallelism for buildworld

by Simon 'corecode' Schubert

Hey,

the question of which make parallelism level to use comes up repeatedly.  However, the answer is usually driven by anecdotal evidence rather than empirical data.  To that end, I ran a small benchmark to add one data point.  I have no idea about confidence intervals, so somebody will have to chime in here.

Experimental setup
==================

Machine: Dell Precision T3400
CPU: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz (2826.24-MHz 686-class CPU)
Memory: avail memory = 2063409152 (2015048K bytes)
HDD: da0: <SATA Hitachi HDP72505 GM4O> Fixed Direct Access SCSI-4 device (via AHCI)
filesystem: HAMMER v2
/usr/src: v2.5.1-77-gd894b0e
/usr/obj: flags nohistory, nullfs mount

executed command: make -j $j_level buildworld buildkernel

make levels used: 1-10
repetitions: 5
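
The harness itself is not part of this post.  A minimal sketch of how such a run could be scripted, assuming wall-clock timing via date +%s (one-second resolution) and a hypothetical run_bench helper with a BUILD_CMD override; neither name comes from the original setup:

```shell
#!/bin/sh
# Hypothetical harness, not the one used for the numbers in this post.
# BUILD_CMD is an assumed override so the loop can be dry-run with a cheap command.

# run_bench "LEVELS" REPS
# Prints one line per run: "<j_level> <repetition> <elapsed seconds>".
run_bench() {
    levels=$1
    reps=$2
    for j_level in $levels; do
        rep=1
        while [ "$rep" -le "$reps" ]; do
            start=$(date +%s)
            ${BUILD_CMD:-make} -j "$j_level" buildworld buildkernel >/dev/null 2>&1
            end=$(date +%s)
            echo "$j_level $rep $((end - start))"
            rep=$((rep + 1))
        done
    done
}

# Example matching the setup above (levels 1-10, 5 repetitions), run from /usr/src:
#   run_bench "1 2 3 4 5 6 7 8 9 10" 5 > timings.txt
```

This only records elapsed time; collecting user and sys times as well would need something like time(1) around the make invocation.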

There were no other tasks performed during the tests, although Xorg, windowmaker, terminals, xmms, firefox and thunderbird were running (idling).  Standard background jobs were not disabled.


Discussion
==========
The plot shows the median build time as a line; the error bars show the min/max build times.  The max spike at -j4 is probably because that run coincided with the 3am hammer cleanup.
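
For anyone who wants to reproduce the plotted statistics, here is a rough sketch of how median, min and max per -j level could be computed.  It assumes one input line per run in the form "<j_level> <repetition> <seconds>"; that format and the summarize name are illustrative, not the script used for the plot:

```shell
#!/bin/sh
# summarize: reads "<j_level> <repetition> <seconds>" lines on stdin and
# prints "<j_level> <median> <min> <max>" for each level.
summarize() {
    # Sort by level, then by time, so each level's times arrive in ascending order.
    sort -k1,1n -k3,3n | awk '
        function emit() {
            if (n == 0) return
            # Middle value for odd counts, mean of the two middle values otherwise.
            mid = (n % 2) ? v[(n + 1) / 2] : (v[n / 2] + v[n / 2 + 1]) / 2
            print cur, mid, v[1], v[n]
        }
        $1 != cur { emit(); cur = $1; n = 0 }
        { v[++n] = $3 }
        END { emit() }
    '
}

# Usage, assuming timings.txt holds the per-run lines:
#   summarize < timings.txt
```

With 5 repetitions per level the median is simply the third-fastest run, which is why a single outlier such as the -j4 spike moves the max error bar but not the line.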

We can see a monotonic drop in total run time from -j1 to -j5.  After that the run time plateaus.  User and sys times increase over the same range and likewise plateau beyond -j5.  This shows that increased parallelism in make adds slightly to the total overhead (sys+user), while total run time is significantly reduced.  Beyond -j ncpu+1 we cannot see any improvement in run time.

A -j 2 build does not offer a significant benefit over -j 1, which is counterintuitive and might need some further investigation.

The -j 5 build achieves a 42% reduction in build time relative to the -j 1 baseline.

Compared to the -j 4 (i.e. -j ncpu) build, the -j 5 (i.e. -j ncpu+1) build reduces run time by an additional 5.4%.  This suggests that a parallelism level of only ncpu does not keep all CPU cores busy.
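
The percentages above are relative reductions, i.e. (t_base - t_new) / t_base * 100.  A small sketch of that arithmetic, with pct_reduction being a hypothetical helper name:

```shell
#!/bin/sh
# pct_reduction BASE NEW
# Prints how much shorter NEW is than BASE, as a percentage of BASE.
pct_reduction() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Usage: pct_reduction <median seconds at baseline level> <median seconds at other level>
```

Called with the median times for -j 1 and -j 5 it would give the first figure, and with those for -j 4 and -j 5 the second.  The raw times are not listed in this post, so no concrete call is shown.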


Conclusion
==========

I advise running builds at -j ncpu+1 on 4-cpu systems.  Until we have numbers for 2-cpu and UP (uniprocessor) systems, we cannot give conclusive advice; however, I would try -j3 for those two cases.
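
As a sketch, the advice could be wrapped in a small script.  suggest_jobs and detect_ncpu are hypothetical helper names, and applying ncpu+1 to anything other than a 4-cpu machine is an extrapolation that the data here does not cover:

```shell
#!/bin/sh
# detect_ncpu: number of CPUs, via sysctl hw.ncpu on the BSDs,
# falling back to getconf where that sysctl does not exist.
detect_ncpu() {
    sysctl -n hw.ncpu 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# suggest_jobs NCPU
# Prints a -j value following the conclusion above.
suggest_jobs() {
    ncpu=$1
    if [ "$ncpu" -le 2 ]; then
        echo 3                # untested guess for UP and 2-cpu systems
    else
        echo $((ncpu + 1))    # measured only for ncpu = 4; assumed otherwise
    fi
}

# Usage, from /usr/src:
#   make -j "$(suggest_jobs "$(detect_ncpu)")" buildworld buildkernel
```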


cheers
  simon


Attachment: make-j-runtimes.png (6K)