3.9.80

Guys,

I have released 3.9.80, which is primarily a bug-fix release to fix:
   https://sourceforge.net/tracker/index.php?func=detail&aid=3537219&group_id=23725&atid=379482
This bug affects any machine using AVX. I also switched all of ATLAS's internal gzip usage to bzip2.

Cheers,
Clint

ATLAS 3.9.80 released 06/23/12, changes from 3.9.79:
 * Fixed it so ATL_MinMMAlign is 32 when AVX is used
 * Got rid of HAMMER64SSE2 & HAMMER32SSE3 archdefs; they were for older gcc,
   and my machine died, so I cannot maintain them
 * Fixed xmergvecs so MFLOP_max is max, rather than min
 * Disabled much-abused -Si cputhrchk
 * Replaced all use of gzip/gunzip with bzip2/bunzip2

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats.
http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel

Re: 3.9.80

Can we skip the throttling check if the user specifies the architecture? If I want to build a generic binary, then I'm not doing this on the machine where the output will be running, so what's the point of checking whether it's throttled?

Incidentally, I'd like to get some feedback on "generic" choices for a library that is supposed to run on a wide variety of machines (i.e. Sage binary builds). We currently use configure -A # -V # with numbers computed from the following values:

64-bit Intel:
    arch = 'x86SSE2'
    isa_ext = ('SSE2', 'SSE1')

32-bit Intel:
    arch = 'x86x87'
    isa_ext = ('3DNow',)

SPARC:
    arch = 'USIII'
    isa_ext = ()

PPC:
    arch = 'POWER4'
    isa_ext = ()

Itanium:
    arch = 'IA64Itan'
    isa_ext = ()
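
For anyone keeping a table like this in a build script, here is a minimal sketch in Python (the values in the message already look like Python tuples). The names GENERIC_TARGETS and generic_target are hypothetical. Turning these names into the numeric -A and -V arguments is deliberately left out, since those numbers come from ATLAS's own configure sources and are not given in this thread.

```python
# Hypothetical lookup table; the entries are exactly those listed in the message.
GENERIC_TARGETS = {
    "64-bit Intel": {"arch": "x86SSE2", "isa_ext": ("SSE2", "SSE1")},
    "32-bit Intel": {"arch": "x86x87", "isa_ext": ("3DNow",)},
    "SPARC": {"arch": "USIII", "isa_ext": ()},
    "PPC": {"arch": "POWER4", "isa_ext": ()},
    "Itanium": {"arch": "IA64Itan", "isa_ext": ()},
}


def generic_target(platform):
    """Return (arch, isa_ext) for a platform name, or raise if none is recorded."""
    try:
        entry = GENERIC_TARGETS[platform]
    except KeyError:
        raise ValueError("no generic ATLAS target recorded for %r" % (platform,))
    return entry["arch"], entry["isa_ext"]
```

Failing loudly on an unknown platform is a design choice here: a silent default would hide the case where a new build host has no agreed generic target yet.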

Re: 3.9.80

>Can we skip the throttling check If the user specifies the
>architecture? If I want to build a generic binary then I'm not doing
>this on the machine where the output will be running, so whats the
>point of checking if its throttled?

Yes, if you aren't building for the compiling machine anyway, then you can probably get away with turning this off. See the atlas_install extract below for some CacheEdge advice.

I have purposely left most of the code in, so that people who insist can turn the check off easily. I removed it because essentially every "atlas auto-build" script I got was throwing it. It looked to me like people just said "hey, this makes it install regardless, so I'll just always throw it". The problem is that ATLAS gets its performance from timings, and when CPU throttling is on, the OS's throttling has a much larger effect on performance than almost any optimization that ATLAS applies, which means you get a library with random transformations applied, rather than an optimized lib. Since I couldn't get people to stop throwing the flag, I disabled it.

If you use archdefs, then you can still get certain things wrong that aren't specified by archdefs (e.g., CacheEdge, for many archs), but at least the entire library isn't randomized. The generic archdefs tend to be much more fully specified, since I know that the timings won't hold true for all machines.

>Incidentally, I'd like to get some feedback on "generic" choices for a
>library that is supposed to run on a wide variety of machines (i.e.
>Sage binary builds). We currently use configure -A # -V # with numbers
>computed from the following values:
>
>64-bit Intel:
>    arch = 'x86SSE2'
>    isa_ext = ('SSE2', 'SSE1')
>
>32-bit Intel:
>    arch = 'x86x87'
>    isa_ext = ('3DNow',)
>
>SPARC:
>    arch = 'USIII'
>    isa_ext = ()

These are up-to-date, but I don't have access to a parallel sparc that has a modern gcc on it, and so they don't specify any parallel archdefs, which means your installs will take a long time and do a lot of empirical tuning. You can make your own archdefs on a parallel machine if you want to avoid this.

>PPC:
>    arch = 'POWER4'
>    isa_ext = ()

Does that work for things like G4/G5? There are ISA differences between PowerPC and POWER archs, as well as architectural diffs . . . I think the POWER4 archdefs are completely out-of-date; I have access only to G4/G5, and the G4 just died, so the only thing I can maintain now is G5.

>Itanium:
>    arch = 'IA64Itan'
>    isa_ext = ()

I presently have no access to Itaniums, so am no longer able to update these archdefs, which are still for gcc 3, just as for POWER4. So, I'm guessing Itanium & POWER4 are not fully specified, and are also completely out-of-date :(

Cheers,
Clint

******************************************************************************
\subsubsection{Selecting a good generic CacheEdge}
ATLAS uses the CacheEdge macro set in
\verb+BLDdir/include/atlas_cacheedge.h+ and \verb+atlas_tcacheedge.h+
to control the L2-cache blocking for the serial and threaded libraries,
respectively. You'll want to be sure this value is either set to the minimum
of the L2SIZE of any target architecture, or ridiculously large, so that no
effective L2 blocking is done.

So, if you are using non-celeron x86, it is almost always safe to set this
value (in both files) to 256K (262144), since almost all archs have at least
this much cache. If you know your target machines have more cache than this,
then increase this number appropriately.

If you may have celerons or other archs with crippled last-level caches, then
I recommend you set CacheEdge to \verb+4194304+ (4MB). At this level,
CacheEdge doesn't effectively block for caches, but it will tend to keep your
workspace requirements down.
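
The CacheEdge rule in the extract can be written down as a few lines of Python. This is only a sketch of the advice as stated there, not anything ATLAS ships: the function name generic_cache_edge and its parameters are hypothetical, and it only computes the value; editing atlas_cacheedge.h and atlas_tcacheedge.h is still done by hand.

```python
# Hypothetical helper, not part of ATLAS: encodes the CacheEdge advice quoted above.
NO_BLOCKING_EDGE = 4194304  # 4MB: too large to block for L2, but bounds workspace
SAFE_X86_EDGE = 262144      # 256K: the value suggested for non-celeron x86


def generic_cache_edge(l2_sizes, may_have_crippled_cache=False):
    """Pick a CacheEdge (in bytes) for a binary meant to run on several machines.

    l2_sizes: known L2 sizes, in bytes, of the intended target machines.
    may_have_crippled_cache: True if celerons or similar may be among the targets.
    """
    if may_have_crippled_cache:
        # Give up on effective L2 blocking rather than block for the wrong size.
        return NO_BLOCKING_EDGE
    if not l2_sizes:
        raise ValueError("need at least one target L2 size")
    # Otherwise block for the smallest L2 among the targets.
    return min(l2_sizes)
```

For example, generic_cache_edge([262144, 524288, 1048576]) returns 262144, which matches the 256K suggestion for ordinary x86 targets, and whatever value is chosen goes into both header files.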