3.9.6

3.9.6

by Clint Whaley

Guys,

3.9.6 is out.  The main new feature is that I now tune LAPACK for threading in
a separate step (since threaded timings usually want a larger block factor,
and want it to grow more quickly).  I also added configure and arch def support
for the new Corei7 processor.  There are also several bug fixes.  The build
process still works only with LAPACK 3.1.1.

Cheers,
Clint

ATLAS 3.9.6 released 02/01/09, Changes from 3.9.5:
   * Made it so LAPACK is tuned specifically for threading as well as for serial
     - Added threaded lapack arch defs for:
       + Core264SSE3, P4E64SSE3, Corei764SSE3
   * Made it so LAPACK NB-tuning is mu/nu aware
   * MIPSICE9 (sicortex) improvements:
     - added pathcc arch defs
     - updated gcc arch defs to better values
     --> Still getting errors on this platform
   * Some bug fixes:
     - Detect model 29 as Core2
     - Rewrote ptFlushAreasByCL to use new thread framework
     - Fixed handling of non-power-of-2 number of threads
     - Better dependencies for building ilaenv
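
As a rough illustration of the first two changelog items, here is a minimal
Python sketch, not ATLAS code. It assumes "mu/nu aware" means that candidate
block factors (NB) are restricted to multiples of both register-blocking
factors, and that serial and threaded runs are timed and tuned independently.
The names candidate_block_factors, best_block_factor, tune_separately and the
time_fn callback are hypothetical.

```python
from math import gcd


def candidate_block_factors(mu, nu, nb_min, nb_max):
    """Return every NB in [nb_min, nb_max] that is a multiple of both mu and nu.

    Assumption: this is what "mu/nu aware" NB-tuning amounts to.
    """
    step = mu * nu // gcd(mu, nu)                  # least common multiple
    first = ((nb_min + step - 1) // step) * step   # round nb_min up to a multiple
    return list(range(first, nb_max + 1, step))


def best_block_factor(timings):
    """timings maps NB -> MFLOPS; return the fastest NB (smaller NB wins ties)."""
    return max(sorted(timings), key=lambda nb: timings[nb])


def tune_separately(time_fn, mu, nu, nb_min, nb_max):
    """Pick one NB for serial runs and another for threaded runs.

    time_fn(nb, mode) is a hypothetical callback that returns the measured
    MFLOPS for block factor nb in mode "serial" or "threaded".
    """
    candidates = candidate_block_factors(mu, nu, nb_min, nb_max)
    chosen = {}
    for mode in ("serial", "threaded"):
        timings = {nb: time_fn(nb, mode) for nb in candidates}
        chosen[mode] = best_block_factor(timings)
    return chosen
```

Keeping the two searches separate lets the threaded result settle on a larger
NB than the serial one, without either choice constraining the other.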

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************

_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel

Re: 3.9.6

by Michael Abshoff-3

On Sun, Feb 1, 2009 at 1:27 PM, Clint Whaley <whaley@...> wrote:
> Guys,

Hi Clint,

> 3.9.6 is out.  The main new feature is that I now tune LAPACK for threading in
> a separate step (since threaded timings usually want a larger block factor,
> and want it to grow more quickly).  I also added configure and arch def support
> for the new Corei7 processor.  There are also several bug fixes.  The build
> process still works only with LAPACK 3.1.1.

I am curious about two things:

 1) I have to build ATLAS 3.8.2 on a Core i7 and as mentioned above
you have your own tuning setting in 3.9.6, so how much of a bad idea
would it be to use the Core2 tuning info for 3.8.2 for the i7 core?
Right now we get an unknown arch and have to do a full tune which is
kind of annoying if you build ATLAS more than once a week on that CPU.
I don't want to switch to ATLAS 3.9.6 yet in the Sage project, but I
don't think the switch will be too far into the future.

 2) A while back I asked how to treat the Atom (at least the 64-bit
models with SSE3 and all that good stuff), and I do not recall anyone
answering. Somebody else asked me off list a couple of days ago about
that, so it would be nice to get an answer. My instinct is to treat
everything as a Core2 for now.

> Cheers,
> Clint

Cheers,

Michael


Re: 3.9.6

by Clint Whaley

Michael,

> 1) I have to build ATLAS 3.8.2 on a Core i7 and as mentioned above
>you have your own tuning setting in 3.9.6, so how much of a bad idea
>would it be to use the Core2 tuning info for 3.8.2 for the i7 core?
>Right now we get an unknown arch and have to do a full tune which is
>kind of annoying if you build ATLAS more than once a week on that CPU.

My *guess* is that your setting arch to Core2 would get you a fine library
using 3.9.6.  Unfortunately, 3.8.x *sucks* for the Core2, because 2-D register
blocking is critical on that machine.  I should have backported the 2-D kernel
to the 3.8 series when I wrote it (using a template provided by Yevgen
Voronenko), but at the time I thought I would be releasing 3.10 in only a few
months.  Unfortunately (in some respects), our research in threading was wildly
successful, which caused me to rewrite the threading subsystem, and then we
were getting good results by tuning lapack, etc., and I got assigned to a bunch
of committees, so that the 3.10 series is *still* not out.
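
For readers unfamiliar with the term, the following is a minimal Python sketch
of 2-D register blocking in general, not the ATLAS kernel mentioned above. It
assumes a 2x2 block and even matrix dimensions; the function name
matmul_2x2_register_blocked is hypothetical. The point is that each pass of
the inner loop loads two values of A and two of B and reuses them for four
multiply-adds, with the four partial sums held in scalars (the stand-ins for
registers).

```python
def matmul_2x2_register_blocked(A, B):
    """Compute C = A * B for lists of rows, using a 2x2 block of accumulators.

    Assumes the row count of A and the column count of B are both even.
    """
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    for i in range(0, M, 2):
        for j in range(0, N, 2):
            # Four accumulators cover a 2x2 tile of C.
            c00 = c01 = c10 = c11 = 0.0
            for k in range(K):
                # Two loads from A and two from B ...
                a0, a1 = A[i][k], A[i + 1][k]
                b0, b1 = B[k][j], B[k][j + 1]
                # ... feed four multiply-adds.
                c00 += a0 * b0
                c01 += a0 * b1
                c10 += a1 * b0
                c11 += a1 * b1
            C[i][j], C[i][j + 1] = c00, c01
            C[i + 1][j], C[i + 1][j + 1] = c10, c11
    return C
```

A 1-D scheme that blocks only along one dimension reuses the loaded values
from only one of the two operands, so it needs more loads per multiply-add
than the 2-D tile above.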

I really should backport the 2-D kernels to 3.8 and issue a bug-fix release
3.8.3, but I am presently so swamped that I have had almost no time for ATLAS
work, and what time I have had has gone to new stuff so that I don't block my
student's research . . .

> 2) A while back I asked how to treat the Atom (at least the 64-bit
>models with SSE3 and all that good stuff), and I do not recall anyone
>answering. Somebody else asked me off list a couple of days ago about
>that, so it would be nice to get an answer. My instinct is to treat
>everything as a Core2 for now.

I would also guess that.  However, for all of these questions, I recommend
taking the old Pepsi challenge: install ATLAS once without arch defs, and do
"make time".  Then install it again, forcing the arch to Core2 (the -A flag
to configure; see xprint_enums for values) and using the defaults, and do
"make time" again.  You can have xatl_bench directly compare the two
timings, as discussed in the install guide.
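
If you would rather compare the two sets of numbers by hand, the comparison
itself can be sketched in a few lines of Python. This is only an illustration:
it does not parse the output of "make time" or xatl_bench, and it assumes you
have already collected each build's results into a dict mapping routine name
to MFLOPS. The function names and the 5% default tolerance are hypothetical
choices, not anything ATLAS defines.

```python
def compare_timings(full_search, forced_arch):
    """Return routine -> (forced-arch MFLOPS / full-search MFLOPS).

    Only routines present in both dicts, with a positive full-search
    figure, are compared.
    """
    ratios = {}
    for routine in sorted(set(full_search) & set(forced_arch)):
        base = full_search[routine]
        if base > 0:
            ratios[routine] = forced_arch[routine] / base
    return ratios


def forced_arch_is_acceptable(ratios, tolerance=0.05):
    """True if every routine in the forced-arch build is within `tolerance`
    of the full-search build (or faster). False if nothing was compared."""
    return bool(ratios) and all(r >= 1.0 - tolerance for r in ratios.values())
```

A ratio near or above 1.0 for every routine suggests the forced arch defaults
cost nothing relative to the full search on that machine.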

Cheers,
Clint

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************
