MP safe syscall changes

by Andrew Doran

Hi,

I just made a bunch more syscalls MP safe on the newlock2 branch, and ran a
quick test on this machine: 4 x 700MHz P-III Xeon, 1MB L2 cache per CPU, 2GB
RAM. Cleaning out a clean source tree with make -j16 cleandir:

                real            user            system

MP, newlock2    103.55s         174.51s         246.26s
MP, HEAD        114.21s         175.46s         287.01s
UP, newlock2    198.34s         152.67s         61.30s
UP, HEAD        199.18s         150.60s         59.20s

The results aren't astounding and are not particularly accurate (system and
user time are subject to sampling error), but the indicated difference
relative to HEAD is:

                real            system

MP              -9%             -14%
UP              -0.5%           +3.5%
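
For anyone wanting to check the arithmetic, the percentages above can be reproduced from the raw timings. This is only a minimal sketch: the names `timings`, `relative_change` and `summarize` are made up for illustration and are not part of any tool used for the test.

```python
# Timings copied from the first table: (real, user, system) in seconds.
timings = {
    ("MP", "newlock2"): (103.55, 174.51, 246.26),
    ("MP", "HEAD"): (114.21, 175.46, 287.01),
    ("UP", "newlock2"): (198.34, 152.67, 61.30),
    ("UP", "HEAD"): (199.18, 150.60, 59.20),
}


def relative_change(new, base):
    """Return the change of `new` relative to `base`, in percent."""
    return (new - base) / base * 100.0


def summarize(kernel):
    """Return (real %, system %) for newlock2 relative to HEAD."""
    real_new, _, sys_new = timings[(kernel, "newlock2")]
    real_base, _, sys_base = timings[(kernel, "HEAD")]
    return (relative_change(real_new, real_base),
            relative_change(sys_new, sys_base))


for kernel in ("MP", "UP"):
    real_pct, sys_pct = summarize(kernel)
    print(f"{kernel}: real {real_pct:+.1f}%  system {sys_pct:+.1f}%")
```

To one decimal place this gives about -9.3% / -14.2% for MP and -0.4% / +3.5% for UP, which matches the rounded figures in the table.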

At a guess, part of the additional two seconds of system time in the UP case
is a result of all the jumping through hoops that needs to be done to support
locking against both interrupt handlers and LWPs. With interrupts as threads
that cost can be eliminated.

Cheers,
Andrew

Re: MP safe syscall changes

by Perry E. Metzger


Andrew Doran <ad@...> writes:
> I just made a bunch more syscalls MP safe on the newlock2 branch, and ran a
> quick test on this machine: 4 x 700MHz P-III Xeon, 1MB L2 cache per CPU, 2GB
> RAM. Cleaning out a clean source tree with make -j16 cleandir:

I'm happy (and surprised) that there is a speedup here. Consider that
this is not an embarrassingly parallel task: removing lots of files is
ultimately bottlenecked on the disk and not something where we would
expect much parallel speedup.

>                 real            system
>
> MP              -9%             -14%
> UP              -0.5%           +3.5%

I would be quite interested in seeing figures for building the world
in both the current MP and newlock2 MP cases. *That* might have some
interesting opportunities for parallelism on a machine with 2GB of
RAM. Do knock up the number of vnodes to around 2^17 or 2^18 before
starting, though, or you might end up a bit I/O bound with that much
parallelism.
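
For reference, the two vnode counts work out as below. This is a rough sketch that assumes the limit is raised through the `kern.maxvnodes` sysctl; check the name on the system in question before relying on it. The script only prints the commands and changes nothing.

```python
# Vnode limits suggested in the message: 2^17 and 2^18.
candidates = [2 ** exp for exp in (17, 18)]

# Assumption: the limit is exposed as the kern.maxvnodes sysctl.
commands = [f"sysctl -w kern.maxvnodes={n}" for n in candidates]

for command in commands:
    print(command)
```

That is 131072 and 262144 vnodes respectively.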

Perry