mv performance

by painter


I'm a new ATLAS user and have tried version 3.8.2.  I couldn't reach the reference level of performance for double-precision matrix-vector multiply (no transpose), which is the part of most interest to me: in the table below, kMV_N for double-precision real shows 38.3 against a reference of 48.2.  Should I be surprised?  What should I be doing to improve it?

Here's the "make time" output:

Reference clock rate=1597Mhz, new rate=2390Mhz
   Refrenc : % of clock rate achieved by reference install
   Present : % of clock rate achieved by present ATLAS install

                    single precision                  double precision
            ********************************   *******************************
                  real           complex           real           complex
            ---------------  ---------------  ---------------  ---------------
Benchmark   Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
=========   ======= =======  ======= =======  ======= =======  ======= =======
  kSelMM      346.9   369.1    337.9   368.7    181.7   184.5    180.6   177.0
  kGenMM      167.6   177.2    179.0   158.0    159.0   152.7    153.0   158.0
  kMM_NT      126.4   126.6    137.7   134.2    105.8   100.7    116.7   119.7
  kMM_TN      151.2   134.2    156.7   142.9    124.1   116.6    125.4   134.2
  BIG_MM      325.3   336.0    319.8   333.6    171.1   177.6    168.3   174.0
   kMV_N       50.5    45.2     96.7    93.4     48.2    38.3     91.1    55.9
   kMV_T       54.3    56.0     63.0    62.1     32.0    30.3     49.8    43.3
    kGER       39.3    44.5     69.9    71.3     20.9    22.4     44.3    39.7
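
To make the shortfall concrete, here is a minimal Python sketch that takes the double-precision real columns from the table above and computes Present/Refrenc for each benchmark. The names `DOUBLE_REAL`, `relative_to_reference` and `implied_mflops` are made up for this illustration and are not part of ATLAS. The MFLOPS conversion assumes that "% of clock rate" means MFLOPS divided by the clock rate in MHz, times 100; if that assumption is wrong, only the ratio column is meaningful.

```python
# Double-precision real columns copied from the "make time" table.
# Each entry: benchmark -> (Refrenc %, Present %)
DOUBLE_REAL = {
    "kSelMM": (181.7, 184.5),
    "kGenMM": (159.0, 152.7),
    "kMM_NT": (105.8, 100.7),
    "kMM_TN": (124.1, 116.6),
    "BIG_MM": (171.1, 177.6),
    "kMV_N": (48.2, 38.3),
    "kMV_T": (32.0, 30.3),
    "kGER": (20.9, 22.4),
}

REF_MHZ = 1597.0  # "Reference clock rate" from the make time header
NEW_MHZ = 2390.0  # "new rate" from the make time header


def relative_to_reference(table):
    """Return Present/Refrenc for every benchmark in the table."""
    return {name: present / ref for name, (ref, present) in table.items()}


def implied_mflops(percent, mhz):
    """Convert a '% of clock rate' figure to MFLOPS.

    Assumption (not confirmed by the output itself): the percentage
    is MFLOPS / MHz * 100.
    """
    return percent / 100.0 * mhz


ratios = relative_to_reference(DOUBLE_REAL)

# Print the benchmarks from worst to best relative to the reference install.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    ref, present = DOUBLE_REAL[name]
    print(
        f"{name:8s} present/ref = {ratio:4.2f}   "
        f"ref ~ {implied_mflops(ref, REF_MHZ):6.0f} MFLOPS   "
        f"present ~ {implied_mflops(present, NEW_MHZ):6.0f} MFLOPS"
    )
```

Sorting by the ratio puts the benchmark that falls furthest below the reference install first. Under the stated MFLOPS assumption, the same loop also shows how the absolute rates compare once the difference between the two clock rates is taken into account.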

The computer has (from /proc/cpuinfo) 2 CPUs, each:
  AMD Opteron Processor 250, 2390 MHz, cache 1024 KB
It runs Red Hat Enterprise Linux 3 and is shared by multiple users.
The standard gcc on this system is version 3.2.3, but I downloaded version 4.2.4 and used this configure line:
../configure --prefix=$HOME/ATLAS/atlas.a4 -Fa alg -fPIC -Ss kern $HOME/local/gcc-4.2.4/bin/cc
I installed the new version of Make.mvtune.
The automatic computation of CacheEdge doesn't work consistently in this multi-user environment, so
I set it to 768 KB.  According to my experiments running xfindCE, the value doesn't matter much on one processor.

If you've gotten this far, thank you for your attention!

- Jeff Painter