ATLAS 3.9.0 & LAPACK

by Clint Whaley

Guys,

It's been a long time coming, but I have finally heaved out 3.9.0.  The
main reason for this long delay is that I did a major rewrite of
ATLAS for additional rank-K performance, which timings showed was a
big win **until I fixed the performance bug that mandated 3.8.2**.  After
that, I found I had written thousands of lines of code for nothing, so
I had to yank the code back out :(

However, 3.9.0 is finally out, and it has several key features that I hope will
make it worth the wait.  There are much improved DGEMM kernels for Core2Duo64
and K10h64 architectures.  These kernels (particularly K10h) can still be
improved, and I haven't yet ported them to single precision or 32 bits.
However, this should provide some relief on the Core2Duo, where ATLAS was
taking a savage beat-down from Goto and MKL blas.  ATLAS still trails Goto, but
it is not quite the same excoriating humiliation now (at least for double).
The key to the Core2Duo64 was doing 2D blocking, which I had tested but
apparently messed up before.  Thanks to Yevgen Voronenko of CMU/SPIRAL, who
gave me a code fragment to work from (see ATLAS/doc/AtlasCredits.txt for
details).

The main focus of 3.9.0 has been improving ATLAS's LAPACK
support.  First, you no longer have to install
LAPACK separately from ATLAS.  If you have LAPACK 3.1.1 untarred somewhere,
you can use the flag '-Ss lasrc /path/to/lapack3.1.1/SRC', and ATLAS
will automatically build it during the ATLAS build, with no need for the
flag/make.inc headaches that we have in the 3.8 series.   You can also
provide '--with-netlib-lapack-tarfile=/path/to/tarfile' and ATLAS will
extract the tarfile for you in the ATLAS directory, and build it from
there.  If you have more than one install, the -Ss flag saves space,
because all ATLAS installs share one copy of the LAPACK source,
so I recommend the first method.

The second big lapack push for this release is that I've started to
support a new C API for lapack, which I hope to eventually expand to
all of LAPACK.  For most of the routines, it calls the F77LAPACK, but
for ATLAS native routines (like LU/Cholesky) it calls ATLAS's faster
routines instead.  The name is the f77 name, in lower case, with a "C_"
prepended.  Thus DGETRF is C_dgetrf.  Character arguments (Uplo, Trans, etc)
are replaced by CBLAS enum types, and all (non-complex) scalars are passed
by value.  This API supports only column major arrays (it mostly calls
the F77/netlib lapack, which are column-major only).  Routines that take
workspace in F77 don't in the C_ equivalents, as the wrapper auto-queries
LAPACK and allocates.  However, if you want to allocate the work yourself, the
routine taking workspace usually exists with the name C_rout_wrk and you can
test if it exists by doing (it may not exist if ATLAS supports the routine
natively):
   #ifdef ATL_C2F<rout>_wrk__   (e.g., ATL_C2Fdgels_wrk__)
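
As a rough illustration of the compile-time test above, the sketch below
reports whether the workspace-taking dgels variant is visible.
USE_ATLAS_C_LAPACK is a hypothetical build switch (not an ATLAS macro) that
keeps the file compilable on a machine without ATLAS, and the sketch assumes
the ATL_C2F<rout>_wrk__ macros become visible once C_lapack.h is included.

```c
/* USE_ATLAS_C_LAPACK is a hypothetical switch set by your own build,
 * e.g. -DUSE_ATLAS_C_LAPACK; it is not defined by ATLAS itself. */
#ifdef USE_ATLAS_C_LAPACK
#include "C_lapack.h"   /* main header for the C_ API, per the announcement */
#endif

/* Returns 1 if the caller-supplied-workspace variant of dgels
 * (C_dgels_wrk) is available at compile time, 0 otherwise.
 * Assumption: ATL_C2Fdgels_wrk__ is defined by the ATLAS headers
 * exactly when that variant exists. */
int have_dgels_wrk(void)
{
#ifdef ATL_C2Fdgels_wrk__
   return 1;
#else
   return 0;
#endif
}
```

Code that wants to manage its own workspace could branch on the same macro
and fall back to the auto-allocating C_dgels when it is not defined.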

This API is currently supported for the following LAPACK routines:
   ATLAS native routines:
      xPOSV xGESV xPOTRF xGETRF xPOTRS xPOTRI xLAUMM
   C2F wrappers:
      xGELS xGELQF xGERQF xGEQLF xGEQRF
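
The naming rule described earlier (lower-case the f77 name and prepend "C_")
can be sketched as a small helper.  The function name c_lapack_name is made
up for this example, and it assumes the leading 'x' in the list above is the
usual LAPACK precision placeholder (s, d, c or z).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build the C_ API name for a LAPACK routine (hypothetical helper).
 *   f77  : routine name such as "DGETRF", or a family name such as "xGETRF"
 *   prec : precision letter substituted for a leading 'x' (ignored otherwise)
 *   out  : destination buffer of outsz bytes
 * Returns 0 on success, -1 if the name is empty or out is too small. */
int c_lapack_name(const char *f77, char prec, char *out, size_t outsz)
{
   size_t len = strlen(f77);

   if (len == 0 || outsz < len + 3)   /* "C_" + name + terminating NUL */
      return -1;
   out[0] = 'C';
   out[1] = '_';
   for (size_t i = 0; i < len; i++) {
      /* Only a leading lower-case 'x' is treated as the placeholder. */
      char ch = (i == 0 && f77[0] == 'x') ? prec : f77[i];
      out[i + 2] = (char)tolower((unsigned char)ch);
   }
   out[len + 2] = '\0';
   return 0;
}
```

With this helper, "DGETRF" maps to "C_dgetrf" as in the text, and "xGELS"
with precision 's' maps to "C_sgels".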

Obviously, you need to build the full lapack library (and thus need a
functional F77 compiler) to use these routines.
You can find more info in the following files found in ATLAS/include:
    C_lapack.h    # main header file you must include to use the C_lapack API
    clapack.h     # header for ATLAS's native lapack
    atlas_C2Flapack.h   # header for C to F77 wrapper functions.

I would like to get some feedback on this new API.  I use macros to select
between native & C2F files to save some calling overhead.  Is this real bad
news for people?  Will it make your life easier to have a full C API supported
out-of-the-box for ATLAS?  If there is a demand for this API, I can fill
it out fairly rapidly (with some help from you guys for testing); if there's
not, I will populate it only as needed for internal ATLAS stuff.
So, speak up if you are interested!

Finally, the last lapack deal is that ATLAS can now tune some
of the lapack routines that it doesn't natively support by empirically
tuning LAPACK's blocking factor to both the platform and problem size.
Right now, ATLAS autotunes only the QR factorization routines mentioned
above.  Initial timings show improvements ranging from 5-25% (as much as
75% for small problems on Itanium!).  Core2Duo64SSE3 has arch defaults
with QR pretuned.  By default, ATLAS does not tune LAPACK.  To enable it,
you pass '-Si latune 1' to configure.  You will only want to do this if
QR (or one of the many LAPACK routines that call it) is important to you:
my present BFI lapack tuning framework adds roughly 3 hours to a *fast*
machine's install!
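
To make "empirically tuning the blocking factor" concrete, here is a minimal
brute-force sketch of the general idea.  It is not ATLAS's tuning framework:
the kernel is a blocked matrix transpose standing in for QR, and every name
(blocked_transpose, time_kernel, tune_block_size) is invented for this
example.  Calling it once per problem size gives a size-dependent choice.

```c
#include <stdlib.h>
#include <time.h>

/* Stand-in kernel: out-of-place transpose of an n x n column-major
 * matrix, processed in nb x nb tiles. */
static void blocked_transpose(int n, int nb, const double *A, double *B)
{
   for (int jj = 0; jj < n; jj += nb)
      for (int ii = 0; ii < n; ii += nb)
         for (int j = jj; j < jj + nb && j < n; j++)
            for (int i = ii; i < ii + nb && i < n; i++)
               B[j + (size_t)i * n] = A[i + (size_t)j * n];
}

/* CPU seconds for `reps` runs of the kernel with block size nb.
 * Returns -1.0 for invalid arguments or allocation failure. */
static double time_kernel(int n, int nb, int reps)
{
   if (n <= 0 || nb <= 0 || reps <= 0)
      return -1.0;
   size_t len = (size_t)n * n;
   double *A = malloc(len * sizeof *A);
   double *B = malloc(len * sizeof *B);
   if (!A || !B) {
      free(A);
      free(B);
      return -1.0;
   }
   for (size_t k = 0; k < len; k++)
      A[k] = (double)k;

   clock_t t0 = clock();
   for (int r = 0; r < reps; r++)
      blocked_transpose(n, nb, A, B);
   double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

   volatile double sink = B[len - 1];   /* keep the result observable */
   (void)sink;
   free(A);
   free(B);
   return secs;
}

/* Time every candidate block size for problem size n and return the
 * fastest one, or -1 if no candidate could be timed. */
int tune_block_size(int n, const int *cands, int ncand, int reps)
{
   int best = -1;
   double best_t = 0.0;

   for (int c = 0; c < ncand; c++) {
      double t = time_kernel(n, cands[c], reps);
      if (t < 0.0)
         continue;                      /* skip unusable candidates */
      if (best < 0 || t < best_t) {
         best = cands[c];
         best_t = t;
      }
   }
   return best;
}
```

Exhaustively timing each candidate at each problem size is simple but slow,
which is consistent with the long install times mentioned above.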

Cheers,
Clint

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel

Re: ATLAS 3.9.0 & LAPACK

by Matthew Brett

Hi Clint,

I'm running Ubuntu Hardy on a Pentium M laptop, with LAPACK 3.1.1 and ATLAS 3.9.0.

When I configure with ../configure, all goes well.

With

../configure -Ss lasrc /path/to/lapack/src

I get a segmentation fault:

cat config1.out
gfortran -O -m32 -c /root/installs/atlas/atlas-3.9.0/millroad/..//CONFIG/src/backend/flibchkF.f
gcc -O -fomit-frame-pointer -m32 -L/usr/lib/gcc/i486-linux-gnu/4.2.3 -o xflibchk /root/installs/atlas/atlas-3.9.0/millroad/..//CONFIG/src/backend/flibchkC.c \
              flibchkF.o -l gfortran -lm
Segmentation fault
xconfig exited with 139

It doesn't seem to matter if the path to lapack src exists or not.

Just to let you know...

Matthew
