Re: k10h post-BIOS patch effects

by Clint Whaley

Dean (& guys),

OK, here are a few things.  First, there is a modified xdfc available at:
   www.cs.utsa.edu/~whaley/dload/xdfc
It is my normal kernel timer, modified to call the K10h kernel 1K times.
When I run this on my Phenom, most of the numbers are roughly 8Gflop, but
a lot of them drop to 4Gflop.  Can anyone with a Phenom run this executable
and check whether yours does this too (the perf drop may happen rarely
enough that it is easy to miss)?
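
With 1K samples scrolling past, counting the slow ones is easier than
eyeballing them.  A rough sketch, assuming you have first reduced the timer
output to one Gflop value per line (I am not describing the real xdfc output
format here); count_slow and the sample numbers are made up for illustration:

```shell
# count_slow THRESHOLD: read one Gflop value per line on stdin and
# print how many samples fall below THRESHOLD, out of the total.
# (Hypothetical helper; the input format is an assumption.)
count_slow() {
    awk -v t="$1" '
        NF && $1 + 0 < t { slow++ }   # sample below the threshold
        NF               { n++ }      # every non-blank line is a sample
        END { printf "%d of %d below %s\n", slow, n, t }'
}

# Made-up sample numbers, not real timer output.
printf '%s\n' 8.1 8.0 4.1 7.9 4.0 | count_slow 6
```

A threshold of 6 sits between the 8Gflop and 4Gflop levels, so it separates
the two cases cleanly.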

If this executable won't work for you (e.g., different libraries), you can
create it yourself by changing line 634 of ATLAS/tune/blas/gemm/fc.c from:
   #define NSAMPLE 3
to:
   #define NSAMPLE 1024

And then issuing (in $BLDdir/tune/blas/gemm):
   make ummcase pre=d DMCFLAGS="-x assembler-with-cpp" \
        mmrout=CASES/ATL_dmm8x1x120_L1pf.c nb=40

>try this:
>setpci -d 1022:1204 64.l

bit 0 was zero for me :(

>setpci -d 1022:1204 64.l=0

Did this (despite the above), and ./xdfc still behaves the same way.
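
For anyone repeating the check, here is a small sketch of testing bit 0 of
the value setpci prints.  It assumes setpci prints the register as bare hex
digits on one line (with several matching devices you get one line each, so
check them one at a time); bit_is_set and the sample value are invented for
illustration:

```shell
# bit_is_set HEXVALUE BIT: succeed if bit BIT is set in HEXVALUE.
# HEXVALUE is bare hex digits, as printed by "setpci ... 64.l".
# (Hypothetical helper, not part of pciutils.)
bit_is_set() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Sample value only; on real hardware use something like:
#   reg=$(setpci -d 1022:1204 64.l)
reg=00000001
if bit_is_set "$reg" 0; then
    echo "bit 0 set"
else
    echo "bit 0 clear"
fi
```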

>> Yeah, but I did not expect that massive die-off for a cache-dominated
>> algorithm like GEMM.  For HPC, the slowdown is massive and pervasive,
>> but the TLB bug is triggered daily.
>
>are you sure it's the TLB bug?  in lots of testing i've never tripped the
>erratum 298 problem.

No, but we were debugging a lot of large parallel DGEMMs, and the machine was
dying roughly once a day.  I applied the patch, and the machine has been
stable since, so I just assumed.  However, it could have been something
we are doing differently in our testing (as the code has changed), or some
other unrelated thing in the BIOS . . .

>if you want to experiment with the workarounds, build
>http://code.google.com/p/iotools/ and put it into your PATH.
>
>then execute a script something like this:
>
>for cpu in `awk '/^processor/ {print $3}' /proc/cpuinfo`; do
>        # disable erratum 298 workaround
>        wrmsr $cpu 0xc0010015 $(and $(rdmsr $cpu 0xc0010015) $(not $(shl 1 3)))
>        wrmsr $cpu 0xc0011023 $(and $(rdmsr $cpu 0xc0011023) $(not $(shl 1 1)))
>
>        # disable erratum 309 workaround
>        wrmsr $cpu 0xc0011023 $(and $(rdmsr $cpu 0xc0011023) $(not $(shl 1 23)))
>done
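
As I read the quoted script, each wrmsr line reads an MSR, clears a single
bit, and writes the result back.  Here is a sketch of just that clear-bit
arithmetic in plain shell, with no MSR access at all.  It assumes the
iotools and/not/shl commands are the bitwise helpers they appear to be;
clear_bit and the sample value are made up for illustration:

```shell
# clear_bit VALUE BIT: print VALUE with bit BIT cleared, in hex.
# Same arithmetic as: and VALUE $(not $(shl 1 BIT))
# (Hypothetical helper; it does not touch any MSR.)
clear_bit() {
    printf '0x%x\n' $(( $1 & ~(1 << $2) ))
}

# Sample value only, not read from a real MSR.
clear_bit 0xff 3    # bit 3, as in the 0xc0010015 line
clear_bit 0xff 1    # bit 1, as in the first 0xc0011023 line
```

Because the script does a read-modify-write per CPU, the other bits in each
register are left as the BIOS set them.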

Does this program let you change stuff in the BIOS on the fly?  Or are these
Linux workarounds that I need to apply with an unpatched BIOS?  I
guess I need to check it out with savana & compile it (I didn't see any
simple download link)?

Thanks,
Clint

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************

_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel