3.9.10


3.9.10

by Clint Whaley

Guys,

I've finally gotten 3.9.10 out.  This is the first release of the new threaded
system that passes a boatload of tests on the sicortex machine, which with
its non-x86 processors, and especially its non-power-of-2 # of processors,
turned out to be a great platform for finding my bugs and bad assumptions.
It's also got a fix for windows/shared lib building, and a small makefile
fix.  ChangeLog is below.

Cheers,
Clint

ATLAS 3.9.10 released 03/11/09, changes from 3.9.9
   * Rewrote tgemm's combine routine to work on arbitrary partitionings
     combined in arbitrary orders (necessary for non-power-of-2 processors)
     - Restricted fix for SYRK (not general, as it isn't needed yet)
   * Fixed bug in EnforceNonPwr2LO caused by failure to rename moved
     structure in the Cinfp array
   * Fixed makefile problem that caused ATLAS to re-archive the L3BLAS for
     every tester compile
   * On windows, added -lkernel32 to LIBS macro to enable shared lib build
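
For readers wondering why a non-power-of-2 processor count matters here: a combine step that always pairs equal halves only works when the count is a power of 2. What follows is a minimal sketch, not ATLAS code, of a recursive bisection that tolerates uneven halves. The function name combine_schedule and the worker numbering are made up for illustration.

```bash
#!/bin/bash
# combine_schedule FIRST COUNT
# Prints the order in which partial results from COUNT workers, numbered
# from FIRST, would be merged by recursive bisection. When COUNT is odd the
# left half simply gets one more worker than the right half.
# Illustration only: this is NOT how ATLAS's tgemm combine is implemented.
combine_schedule() {
    local lo=$1 n=$2
    if [ "$n" -le 1 ]; then
        return 0
    fi
    local left=$(( (n + 1) / 2 )) right=$(( n / 2 ))
    combine_schedule "$lo" "$left"
    combine_schedule "$(( lo + left ))" "$right"
    echo "combine [$lo..$(( lo + left - 1 ))] with [$(( lo + left ))..$(( lo + n - 1 ))]"
}

# Six workers: a count that is not a power of 2.
combine_schedule 0 6
```

With 6 workers the halves at the second level are 2 and 1 workers wide, which is the kind of uneven partition a power-of-2-only combine cannot express.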


**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************

------------------------------------------------------------------------------
Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are
powering Web 2.0 with engaging, cross-platform capabilities. Quickly and
easily build your RIAs with Flex Builder, the Eclipse(TM)based development
software that enables intelligent coding and step-through debugging.
Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com
_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel

Re: 3.9.10

by M. Edward (Ed) Borasky

On Wed, Mar 11, 2009 at 6:55 PM, Clint Whaley <whaley@...> wrote:

> Guys,
>
> I've finally gotten 3.9.10 out.  This is the first release of the new threaded
> system that passes a boatload of tests on the sicortex machine, which with
> its non-x86 processors, and especially its non-power-of-2 # of processors,
> turned out to be a great platform for finding my bugs and bad assumptions.
> It's also got a fix for windows/shared lib building, and a small makefile
> fix.  ChangeLog is below.
>
> Cheers,
> Clint
I have some numbers from my 4 GB Athlon64 X2 2.2 GHz. System is
running openSUSE 11.1 with GCC 4.3.2:

> uname -a
Linux DreamScape 2.6.27.19-3.2-default #1 SMP 2009-02-25 15:40:44
+0100 x86_64 x86_64 x86_64 GNU/Linux
> gcc --version
gcc (SUSE Linux) 4.3.2 [gcc-4_3-branch revision 141291]

--
M. Edward (Ed) Borasky
http://www.linkedin.com/in/edborasky

I've never met a happy clam. In fact, most of them were pretty steamed.

[atlas-time.log]

gcc -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//CONFIG/include  -g -w -c /home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//CONFIG/src/atlbench.c
gcc -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//CONFIG/include  -g -w -c /home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//CONFIG/src/atlconf_misc.c
gcc -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//CONFIG/include  -g -w -o xatlbench atlbench.o atlconf_misc.o
atlconf_misc.o: In function `CmndResults':
/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//CONFIG/src/atlconf_misc.c:306: warning: the use of `tmpnam' is dangerous, better use `mkstemp'
make -f Make.top time
make[1]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build'
./xatlbench -dc /home/Projects/linux_perf_viz/build-scripts/atlas_build/bin/INSTALL_LOG -dp /home/Projects/linux_perf_viz/build-scripts/atlas_build/ARCHS/HAMMER64SSE3
make[2]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
gcc -o sgemmtst_big.o -c -DL2SIZE=4194304 -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include/contrib -DAdd_ -DF77_INTEGER=int -DStringSunStyle -DATL_OS_Linux -DATL_ARCH_HAMMER -DATL_CPUMHZ=2200 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_3DNow -DATL_USE64BITS -DATL_GAS_x8664  -DATL_FULL_LAPACK -DATL_NCPU=2 -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -DSREAL -DTRUST_BIG   /home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//bin/gemmtst.c
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[3]: Nothing to be done for `lib'.
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm ; make slib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make auxillib scleanuplib susergemm
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[5]: Nothing to be done for `lib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd KERNEL ; make -f sMakefile slib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[5]: Nothing to be done for `slib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[4]: Nothing to be done for `susergemm'.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make -j 2 slib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[4]: `slib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing ; make slib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make -j 2 slib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[4]: `slib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
gfortran -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -o xsmmtst_big sgemmtst_big.o \
                   /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libtstatlas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libf77refblas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_sbig_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_sbignork_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_ssmall_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_ssmallK_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libatlas.a -lpthread -lm
make[2]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
res+off=7612.9  1.00   ---

BIG_MM N=1600, mf=7612.90,7641.30!
make[2]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
gcc -o cgemmtst_big.o -c -DL2SIZE=4194304 -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include/contrib -DAdd_ -DF77_INTEGER=int -DStringSunStyle -DATL_OS_Linux -DATL_ARCH_HAMMER -DATL_CPUMHZ=2200 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_3DNow -DATL_USE64BITS -DATL_GAS_x8664  -DATL_FULL_LAPACK -DATL_NCPU=2 -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -DSCPLX -DTRUST_BIG   /home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//bin/gemmtst.c
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[3]: Nothing to be done for `lib'.
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm ; make clib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make auxillib ccleanuplib cusergemm
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[5]: Nothing to be done for `lib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd KERNEL ; make -f cMakefile clib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[5]: Nothing to be done for `clib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[4]: Nothing to be done for `cusergemm'.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make -j 2 clib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[4]: `clib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing ; make clib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make -j 2 clib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[4]: `clib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
gfortran -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -o xcmmtst_big cgemmtst_big.o \
                   /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libtstatlas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libf77refblas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_cbig_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_csmall_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_csmallK_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libatlas.a -lpthread -lm
make[2]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
res+off= 7474.0 1.00   ---

BIG_MM N=1600, mf=7474.00,7460.40!
make[2]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
gcc -o dgemmtst_big.o -c -DL2SIZE=4194304 -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include/contrib -DAdd_ -DF77_INTEGER=int -DStringSunStyle -DATL_OS_Linux -DATL_ARCH_HAMMER -DATL_CPUMHZ=2200 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_3DNow -DATL_USE64BITS -DATL_GAS_x8664  -DATL_FULL_LAPACK -DATL_NCPU=2 -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -DDREAL -DTRUST_BIG   /home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//bin/gemmtst.c
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[3]: Nothing to be done for `lib'.
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm ; make dlib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make auxillib dcleanuplib dusergemm
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[5]: Nothing to be done for `lib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd KERNEL ; make -f dMakefile dlib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[5]: Nothing to be done for `dlib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[4]: Nothing to be done for `dusergemm'.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make -j 2 dlib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[4]: `dlib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing ; make dlib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make -j 2 dlib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[4]: `dlib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
gfortran -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -o xdmmtst_big dgemmtst_big.o \
                   /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libtstatlas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libf77refblas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_dbig_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_dbignork_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_dsmall_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_dsmallK_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libatlas.a -lpthread -lm
make[2]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
res+off=3878.5  1.00   ---

BIG_MM N=1600, mf=3878.50,3856.60!
make[2]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
gcc -o zgemmtst_big.o -c -DL2SIZE=4194304 -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include -I/home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//include/contrib -DAdd_ -DF77_INTEGER=int -DStringSunStyle -DATL_OS_Linux -DATL_ARCH_HAMMER -DATL_CPUMHZ=2200 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_3DNow -DATL_USE64BITS -DATL_GAS_x8664  -DATL_FULL_LAPACK -DATL_NCPU=2 -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -DDCPLX -DTRUST_BIG   /home/Projects/linux_perf_viz/build-scripts/atlas_build/../ATLAS//bin/gemmtst.c
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[3]: Nothing to be done for `lib'.
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm ; make zlib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make auxillib zcleanuplib zusergemm
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil ; make lib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
make[5]: Nothing to be done for `lib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/auxil'
cd KERNEL ; make -f zMakefile zlib
make[5]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[5]: Nothing to be done for `zlib'.
make[5]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/KERNEL'
make[4]: Nothing to be done for `zusergemm'.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make -j 2 zlib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[4]: `zlib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm'
cd /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing ; make zlib
make[3]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make -j 2 zlib.grd
make[4]: Entering directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[4]: `zlib.grd' is up to date.
make[4]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
make[3]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/src/testing'
gfortran -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -m64 -o xzmmtst_big zgemmtst_big.o \
                   /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libtstatlas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libf77refblas.a /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_zbig_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_zsmall_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/src/blas/gemm/ATL_zsmallK_mm.o /home/Projects/linux_perf_viz/build-scripts/atlas_build/lib/libatlas.a -lpthread -lm
make[2]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build/bin'
res+off= 3865.7 1.00   ---

BIG_MM N=1600, mf=3865.70,3874.90!

The times labeled Reference are for ATLAS as installed by the authors.
NAMING ABBREVIATIONS:
   kSelMM : selected matmul kernel (may be hand-tuned)
   kGenMM : generated matmul kernel
   kMM_NT : worst no-copy kernel
   kMM_TN : best no-copy kernel
   BIG_MM : large GEMM timing (usually N=1600); estimate of asymptotic peak
   kMV_N  : NoTranspose matvec kernel
   kMV_T  : Transpose matvec kernel
   kGER   : GER (rank-1 update) kernel
Kernel routines are not called by the user directly, and their
performance is often somewhat different than the total
algorithm (eg, dGER perf may differ from dkGER)


Reference clock rate=2200Mhz, new rate=2200Mhz
   Refrenc : % of clock rate achieved by reference install
   Present : % of clock rate achieved by present ATLAS install

                    single precision                  double precision
            ********************************   *******************************
                  real           complex           real           complex
            ---------------  ---------------  ---------------  ---------------
Benchmark   Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
=========   ======= =======  ======= =======  ======= =======  ======= =======
  kSelMM      354.2   358.7    340.0   345.9    163.8   182.4    178.2   182.4
  kGenMM      183.1   170.8    154.6   144.9    163.8   158.3    168.6   170.8
  kMM_NT      135.5   139.8    145.4   145.4    112.6   128.1    131.0   138.0
  kMM_TN      153.3   158.3    141.4   158.3    131.1   147.4    144.8   143.5
  BIG_MM      337.6   347.3    328.7   339.7    159.1   176.3    171.0   176.1
   kMV_N       53.8    54.0    139.2   143.7     36.2    37.2     73.1    75.7
   kMV_T       62.2    61.1     72.8    79.1     33.6    34.5     52.6    54.1
    kGER       45.6    48.6     90.8   101.2     23.7    26.6     47.5    54.2
make[1]: Leaving directory `/home/Projects/linux_perf_viz/build-scripts/atlas_build'
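
The Present column of the BIG_MM row can be reproduced from the mf= lines in the log. A small sketch, assuming (from the numbers shown here, not from the ATLAS source) that the table uses the larger of the two MFLOPS values divided by the clock rate in MHz. pct_of_clock is a hypothetical helper name.

```bash
#!/bin/bash
# pct_of_clock MHZ MFLOPS_A MFLOPS_B
# Prints the larger MFLOPS value as a percentage of the clock rate in MHz.
# Assumption: taking the maximum of the two mf= values is inferred from this
# log; it matches all four BIG_MM entries in the table.
pct_of_clock() {
    local mhz=$1 mf_a=$2 mf_b=$3
    awk -v mhz="$mhz" -v a="$mf_a" -v b="$mf_b" \
        'BEGIN { best = (a > b) ? a : b; printf "%.1f\n", 100 * best / mhz }'
}

pct_of_clock 2200 7612.90 7641.30   # single real    -> 347.3
pct_of_clock 2200 7474.00 7460.40   # single complex -> 339.7
pct_of_clock 2200 3878.50 3856.60   # double real    -> 176.3
pct_of_clock 2200 3865.70 3874.90   # double complex -> 176.1
```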



Re: 3.9.10

by Clint Whaley

Ed,

>I have some numbers from my 4 GB Athlon64 X2 2.2 GHz. System is
>running openSUSE 11.1 with GCC 4.3.2:

The main difference in timings is in average-case threaded performance, as shown:
   http://math-atlas.sourceforge.net/timing/newThr395/index.html

Make time presently reports only serial timings, so I wouldn't expect
anything to really show up there (I probably need to make a pttime target).

Cheers,
Clint

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************


Re: 3.9.10

by znmeb

Meanwhile ... I am running into something on both my Turion64 X2
laptop and Athlon64 X2 desktop. Both of them build successfully
"eventually", but they are taking a very long time in the make step. I
haven't actually sat down and figured out how many hours this runs,
but I've got them both running now and the Athlon64 2.2 GHz is showing
that xclantst_pt has consumed 86 minutes of CPU time. The Turion64 2.1
GHz is showing xzlanbtst with 87 minutes of CPU time. Here's the build
script:

#! /bin/bash -v

# download LAPACK 3.1.1 source
rm -fr lapack-lite*
wget http://netlib.org/lapack/lapack-lite-3.1.1.tgz
tar xf lapack-lite-3.1.1.tgz

# download source
export WHERE='http://voxel.dl.sourceforge.net/sourceforge/math-atlas'
export WHAT='atlas3.9.10.tar.bz2'
export DIR='ATLAS'
rm -fr ${WHAT} ${DIR}
wget ${WHERE}/${WHAT}
tar xf ${WHAT}

# make sure the hardware is set correctly
sudo cpufreq-selector -g performance -c 0
sudo cpufreq-selector -g performance -c 1
cpufreq-info | tee cpufreq-info.log

# now do the build
rm atlas*log # ditch log files
rm -fr atlas_build; mkdir atlas_build; # make a clean directory
cd atlas_build
../${DIR}/configure -Ss lasrc ../lapack-lite-3.1.1/SRC -Si latune 1 \
  > ../atlas-configure.log 2>&1
make > ../atlas-make.log 2>&1
make check > ../atlas-check.log 2>&1
make ptcheck > ../atlas-ptcheck.log 2>&1
make time > ../atlas-time.log 2>&1

# install
sudo make install

cd ..
sudo /sbin/ldconfig

Am I missing something, like a parameter that tells it to use
predefined setups for the processors?
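
To find out how long each stage really takes, the make steps in a script like this could be wrapped in a small timing helper. This is a sketch only: timed_step is a hypothetical function, and it relies on bash's built-in SECONDS counter, so it reports wall-clock seconds rather than the CPU time that top shows.

```bash
#!/bin/bash
# timed_step LABEL COMMAND [ARGS...]
# Runs the command, then prints the label, the elapsed wall-clock seconds,
# and the command's exit status. The exit status is passed through.
timed_step() {
    local label=$1 start=$SECONDS status
    shift
    "$@"
    status=$?
    echo "$label: $(( SECONDS - start )) s (exit status $status)"
    return "$status"
}

# Stand-in command for demonstration. In the build script the call could be
#   timed_step make make > ../atlas-make.log 2>&1
# which also writes the timing line at the end of that log.
timed_step demo sleep 1
```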
--
M. Edward (Ed) Borasky
http://www.linkedin.com/in/edborasky

I've never met a happy clam. In fact, most of them were pretty steamed.


Re: 3.9.10

by Clint Whaley

Ed,

>Meanwhile ... I am running into something on both my Turion64 X2
>laptop and Athlon64 X2 desktop. Both of them build successfully
>"eventually", but they are taking a very long time in the make step. I
>haven't actually sat down and figured out how many hours this runs,
>but I've got them both running now and the Athlon64 2.2 GHz is showing
>that xclantst_pt has consumed 86 minutes of CPU time. The Turion64 2.1
>GHz is showing xzlanbtst with 87 minutes of CPU time. Here's the build
>script:

OK, the first thing is that when you throw the '-Si latune 1' flag to configure,
and I haven't provided lapack architectural defaults, you might as well
kick back while the install runs, and runs, and runs . . .   :)

It is expected that these QR tunings will take hours on an x86 (and they
take days on a MIPS, for instance).  Right now the tuning is completely BFI,
and I'd have to spend a lot of time to see how to improve it, and I'm not
sure how much improvement we would see.  Therefore, I am not concentrating
on improving this right now, so for the foreseeable future if you add latune 1
w/o arch defs, you can expect a loooooong install (that is why latune is not
on by default).
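
Since latune is off unless requested, the same configure line without '-Si latune 1' skips the long LAPACK tuning. A sketch that uses only the flags appearing in the build script quoted in this thread; atlas_configure_args is a hypothetical helper, and the LAPACK source path is the one from that script.

```bash
#!/bin/bash
# atlas_configure_args LAPACK_SRC [LATUNE]
# Prints the configure arguments on one line. LATUNE defaults to 0, in which
# case the latune flag is simply left out and configure uses its default.
atlas_configure_args() {
    local lapack_src=$1 latune=${2:-0}
    local args=(-Ss lasrc "$lapack_src")
    if [ "$latune" = 1 ]; then
        args+=(-Si latune 1)   # opt in to the hours-long LAPACK tuning
    fi
    printf '%s\n' "${args[*]}"
}

atlas_configure_args ../lapack-lite-3.1.1/SRC     # -> -Ss lasrc ../lapack-lite-3.1.1/SRC
atlas_configure_args ../lapack-lite-3.1.1/SRC 1   # -> -Ss lasrc ../lapack-lite-3.1.1/SRC -Si latune 1
```

In a build script the result would be passed on as `../ATLAS/configure $(atlas_configure_args ../lapack-lite-3.1.1/SRC)`, which is fine as long as the path contains no spaces.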

So, now your question can be reduced to: why don't I have lapack arch defs
for these architectures?  So far, I have provided them for only a small
subset of machines that I use every day.  I have no access to any Turion procs,
so that guy is obviously out.  I believe 3.9.10's HAMMER64SSE3's arch defs
should have the lapack arch defs, which means that it should spend no time in
lanbtst.  What is the ARCH string that ATLAS configures your machine as?
(For instance, the 32-bit version of the above, or the SSE2 version, does
not have lapack arch defs.)
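
One place the ARCH string shows up is the xatlbench command line in the 'make time' log posted earlier in this thread, as the last component of the -dp path. A sketch that pulls it out; arch_from_time_log is a hypothetical helper, and the log file name in the usage comment follows the build script.

```bash
#!/bin/bash
# arch_from_time_log LOGFILE
# Prints the directory name that follows ".../ARCHS/" in the first
# "-dp <path>" argument found in the log, e.g. HAMMER64SSE3.
arch_from_time_log() {
    sed -n 's|.* -dp [^ ]*/ARCHS/\([^ /]*\).*|\1|p' "$1" | head -n 1
}

# Usage, assuming the log was written where the build script puts it:
#   arch_from_time_log ../atlas-time.log
```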

Cheers,
Clint

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************


Re: 3.9.10

by znmeb

On Sat, Mar 21, 2009 at 8:48 AM, Clint Whaley <whaley@...> wrote:
> OK, the first thing is that when you throw the '-Si latune 1' flag to configure,
> and I haven't provided lapack architectural defaults, you might as well
> kick back while the install runs, and runs, and runs . . .   :)

> It is expected that these QR tunings will take hours on an x86 (and they
> take days on a MIPS, for instance).  Right now the tuning is completely BFI,
> and I'd have to spend a lot of time to see how to improve it, and I'm not
> sure how much improvement we would see.  Therefore, I am not concentrating
> on improving this right now, so for the foreseeable future if you add latune 1
> w/o arch defs, you can expect a loooooong install (that is why latune is not
> on by default).

The whole goal of this is to do some in-core SVDs, so I think I need
to do the latune.

> So, now your question can be reduced to: why don't I have lapack arch defs
> for these architectures?  So far, I have provided them for only a small
> subset of machines that I use every day.  I have no access to any Turion procs,
> so that guy is obviously out.  I believe 3.9.10's HAMMER64SSE3's arch defs
> should have the lapack arch defs, which means that it should spend no time in
> lanbtst.  What is the ARCH string that ATLAS configures your machine as?
> (For instance, the 32-bit version of the above, or the SSE2 version, does
> not have lapack arch defs.)

The Turion is off the network at the moment, but the Athlon64 X2 log
file up to the present is attached. I can send you the Turion log when
it is done if you want. My recollection is that a Turion is an Athlon
with smaller caches.
>
> Cheers,
> Clint
>
> **************************************************************************
> ** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
> **************************************************************************
>



--
M. Edward (Ed) Borasky
http://www.linkedin.com/in/edborasky

I've never met a happy clam. In fact, most of them were pretty steamed.



[Attachment: athlon64x2 (7M)]

Re: 3.9.10

by znmeb

On Wed, Mar 11, 2009 at 6:55 PM, Clint Whaley <whaley@...> wrote:

> Guys,
>
> I've finally gotten 3.9.10 out.  This is the first release of the new threaded
> system that passes a boatload of tests on the sicortex machine, which with
> its non-x86 processors, and especially its non-power-of-2 # of processors,
> turned out to be a great platform for finding my bugs and bad assumptions.
> It's also got a fix for windows/shared lib building, and a small makefile
> fix.  ChangeLog is below.
>
> Cheers,
> Clint

OK ... I finally got a complete build on the dual core Turion64 X2 4
GB. It took 12 hours and 12 minutes. Is there anything I can send you
that will help future Turion users run this faster??

By the way, I'm integrating ATLAS with R. So far, I've been able to
get the ATLAS blas to link but for some reason the R configure utility
isn't seeing "zgeev_" in the ATLAS lapack libraries. R is compiling
its own lapack from the FORTRAN source. Is that going to be a
performance problem, or is having the ATLAS blas enough?
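
A quick way to see whether a given library actually defines the symbol that R's configure is probing for is to look at its nm listing. A sketch, assuming the usual nm output of address, type letter, and name; defines_symbol is a hypothetical helper, and the library path in the usage comment is a placeholder, not a known install location.

```bash
#!/bin/bash
# defines_symbol NAME
# Reads nm output on stdin and succeeds (exit 0) if NAME is listed as a
# defined global text symbol (type letter T). Undefined references (type U)
# have only two fields and therefore do not match.
defines_symbol() {
    awk -v sym="$1" '$2 == "T" && $3 == sym { found = 1 } END { exit !found }'
}

# Usage, with ATLAS_LIB standing in for wherever the libraries were installed:
#   nm -g "$ATLAS_LIB/liblapack.a" | defines_symbol zgeev_ && echo "zgeev_ present"
```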

--
M. Edward (Ed) Borasky


Re: 3.9.10

by Clint Whaley

Ed,

>OK ... I finally got a complete build on the dual core Turion64 X2 4
>GB. It took 12 hours and 12 minutes. Is there anything I can send you
>that will help future Turion users run this faster??

You can create architectural defaults for your own use:
   http://math-atlas.sourceforge.net/devel/atlas_devel/atlas_devel.html#SECTION00070000000000000000

I'd have to scope things out more carefully than I have time for right now
before I'd add them to the main package, unfortunately . . .

>By the way, I'm integrating ATLAS with R. So far, I've been able to
>get the ATLAS blas to link but for some reason the R configure utility
>isn't seeing "zgeev_" in the ATLAS lapack libraries. R is compiling
>its own lapack from the FORTRAN source. Is that going to be a
>performance problem, or is having the ATLAS blas enough?

Well, you are losing the benefit of 8 of the hours of tuning you just went
through :)  

If you use stock LAPACK rather than an ATLAS-tuned one, your factorization
performance will be reduced enormously on most platforms.

Cheers,
Clint



Re: 3.9.10

by znmeb

On Wed, Mar 25, 2009 at 12:25 PM, Clint Whaley <whaley@...> wrote:

>>By the way, I'm integrating ATLAS with R. So far, I've been able to
>>get the ATLAS blas to link but for some reason the R configure utility
>>isn't seeing "zgeev_" in the ATLAS lapack libraries. R is compiling
>>its own lapack from the FORTRAN source. Is that going to be a
>>performance problem, or is having the ATLAS blas enough?
>
> Well, you are losing the benefit of 8 of the hours of tuning you just went
> through :)
>
> If you use stock LAPACK rather than an ATLAS-tuned one, your factorization
> performance will be reduced enormously on most platforms.

That's what I was afraid of. Any easy way to tell whether it's ATLAS
or R that's not working?
--
M. Edward (Ed) Borasky


Re: 3.9.10

by Dmitri A. Sergatskov

On Wed, Mar 25, 2009 at 4:44 PM, M. Edward (Ed) Borasky
<zznmeb@...> wrote:

>>
>> If you use stock LAPACK rather than an ATLAS-tuned one, your factorization
>> performance will be reduced enormously on most platforms.
>
> That's what I was afraid of. Any easy way to tell whether it's ATLAS
> or R that's not working?

It might be a g77 vs. gfortran issue.
I kind of remember problems in the past with configure preferring g77
over gfortran. If that is the case, you need to specify your Fortran compiler
explicitly (either via an environment variable or as a parameter to ./configure).

Dmitri.
--
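
A quick way to narrow down which side is at fault is to ask the library
itself whether it defines the symbol. The sketch below is only an
illustration: it assumes binutils `nm` is on the PATH and that the ATLAS
LAPACK is a static archive. The helper names `nm_output` and
`defines_symbol`, and the library path in the note that follows, are
hypothetical and not part of ATLAS or R.

```python
import subprocess


def nm_output(library_path):
    """Return the text printed by `nm -g` for a library.

    Assumes binutils `nm` is installed. For a shared library you would
    likely want `nm -D` instead of `nm -g`.
    """
    result = subprocess.run(
        ["nm", "-g", library_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def defines_symbol(nm_text, symbol):
    """Return True if the nm listing contains a definition of `symbol`.

    Defined symbols appear as "<address> <type> <name>"; undefined
    references appear as "U <name>" and are not counted.
    """
    for line in nm_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2] == symbol and fields[1] != "U":
            return True
    return False
```

For example, `defines_symbol(nm_output("/path/to/atlas/lib/liblapack.a"), "zgeev_")`
returning False would suggest the routine is simply absent from that archive
(which can happen if ATLAS was not built against a full netlib LAPACK), while
True would point back at the compiler and linker settings R's configure is using.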


Re: 3.9.10

by Clint Whaley

>>>By the way, I'm integrating ATLAS with R. So far, I've been able to
>>>get the ATLAS blas to link but for some reason the R configure utility
>>>isn't seeing "zgeev_" in the ATLAS lapack libraries. R is compiling
>>>its own lapack from the FORTRAN source. Is that going to be a
>>>performance problem, or is having the ATLAS blas enough?
>>
>> Well, you are losing the benefit of 8 of the hours of tuning you just went
>> through :)
>>
>> If you use stock LAPACK rather than an ATLAS-tuned one, your factorization
>> performance will be reduced enormously on most platforms.
>
>That's what I was afraid of. Any easy way to tell whether it's ATLAS
>or R that's not working?

I am not sure what you mean by this.  If R is building a stock LAPACK, then
I think all you need to do is replace its LAPACK with the ATLAS-generated one.
You should be able to tell that you were successful by seeing that LU, LLt
and QR all get faster (if you use 3.8 rather than 3.9, then only LU and LLt
will get noticeably faster).

Cheers,
Clint
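
The before/after comparison described above can be sketched as a small
timing harness. This is a minimal illustration, not an ATLAS tool:
`lu_flops`, `time_call`, `mflops` and `speedup` are hypothetical helper
names, 2n^3/3 is the usual leading-order operation count for LU, and the
callable you pass in is assumed to run the factorization through whichever
LAPACK build is being tested.

```python
import time


def lu_flops(n):
    """Leading-order floating-point operation count for LU of an n x n matrix."""
    return 2.0 * n ** 3 / 3.0


def time_call(func, repeats=3):
    """Run func() `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def mflops(flop_count, seconds):
    """Convert an operation count and a run time into MFLOPS."""
    return flop_count / (seconds * 1.0e6)


def speedup(seconds_stock, seconds_tuned):
    """How many times faster the tuned build ran than the stock build."""
    return seconds_stock / seconds_tuned
```

Timing the same factorization once against the stock LAPACK and once against
the ATLAS-generated one, then passing both times to `speedup`, gives a single
number to watch; a value close to 1 would suggest the ATLAS routines are not
actually being used.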

