3.9.72

Guys,

I have released 3.9.72. It has two fairly big changes. The first is that I have decoupled the gcc ISA extension flag (-msse, -msse2, etc.) from the compiler flags, and made configure add it automatically based on the probed ISA setting. This change was made to fix this bug:
https://sourceforge.net/tracker/?func=detail&aid=3510801&group_id=23725&atid=379482

The second big change is that I rewrote the FPU probe (formerly masearch) that takes place in tune/sysinfo. I was forced to do this because gcc on many platforms had seen through my obfuscation, worked out that the computation in this probe was useless, and was eliminating it. The new probe seems to work better than the old so far. I may need to do something similar for the L1 cache detection, as I've seen that go bad on a few archs as well.

I also improved some architectural defaults.

Cheers,
Clint

ATLAS 3.9.72 released 03/30/12, changes from 3.9.71:
* Added missing [s,c] files in Dozer64 archdefs
* Provided new fpu probe (?MULADD files) that works better with modern gcc
* Added new archdefs for P4E64SSE3, HAMMER64SSE3
* Made it so -msse/avx/etc autoadded to gcc default flags
* Fixed it so archdef install doesn't rerun gmmsearch unnecessarily

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************

_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel
Re: 3.9.72

On Fri, Mar 30, 2012 at 2:36 PM, Clint Whaley <whaley@...> wrote:
> Guys,
>
> I have released 3.9.72. It has two fairly big changes. The first is that
> I have decoupled the gcc ISA extension flag (-msse, -msse2, etc) from the
> compiler flags, and made configure add it automatically based on the probed
> ISA setting. This change was made to fix this bug:
> https://sourceforge.net/tracker/?func=detail&aid=3510801&group_id=23725&atid=379482
>
> The second big change is that I rewrote the FPU probe (formerly masearch)
> that takes place in tune/sysinfo. I was forced to do this because gcc on
> many platforms had figured out my obfuscation that I was doing useless
> computation during this probe, and was eliminating it. The new probe seems
> to work better than the old so far. I may need to do something similar for
> the L1 cache detection, as I've seen that go bad on a few archs as well.
>
> I also improved some architectural defaults.
>
> Cheers,
> Clint
>
> ATLAS 3.9.72 released 03/30/12, changes from 3.9.71:
> * Added missing [s,c] files in Dozer64 archdefs
> * Provided new fpu probe (?MULADD files) that works better with modern gcc
> * Added new archdefs for P4E64SSE3, HAMMER64SSE3
> * Made it so -msse/avx/etc autoadded to gcc default flags
> * Fixed it so archdef install doesn't rerun gmmsearch unnecessarily

I'm getting back to testing ATLAS (and R using ATLAS) after a long hiatus. I just did a build of 3.9.72 with full LAPACK on my laptop, and it looks like it's peaking at 17 GFLOPS (openSUSE 12.1, /proc/cpuinfo attached). This is with the default flags. I don't see much point in trying to game it when I'm getting that much horsepower out of a $600 machine and the install only takes a few minutes. ;-)

The bad news is that I want to run this in a virtual machine. I have kvm and xen on my Linux machines, and both VirtualBox and VMware Workstation 8 on Windows. I'm hoping eventually to extend this to an Amazon EC2 instance and an OpenStack compute node. What's a reasonable strategy for building images? I don't know what the low-end processors are in host servers these days. They pretty much have to be 64-bit just to get the Intel or AMD virtualization bits, but other than that, how bad can they be / how generic a build do I have to make?

--
Twitter: http://twitter.com/znmeb
Data Journalism Developer Studio 2012LX http://j.mp/DJDS2012LX
"A mathematician is a device for turning coffee into theorems." -- Paul Erdős
Re: 3.9.72

> I'm getting back to testing ATLAS (and R using ATLAS) after a long
> hiatus. I just did a build on my laptop of 3.9.72 with full LAPACK and
> it looks like it's peaking at 17 GFLOPs (openSUSE 12.1, /proc/cpuinfo
> attached.) This is with the default flags - I don't see much point in
> trying to game it when I'm getting that much horsepower out of a $600
> machine and it only takes a few minutes to do the install. ;-)
>
> The bad news is that I want to run this in a virtual machine. I have
> kvm and xen on my Linux machines, and both VirtualBox and VMware
> Workstation 8 on Windows. I'm hoping eventually to extend this to an
> Amazon EC2 instance and an OpenStack compute node. What's a reasonable
> strategy for building images? I don't know what the low-end processors
> are in host servers these days. They pretty much have to be 64-bit
> just to get the Intel or AMD virtualization bits, but other than that,
> how bad can they be / how generic a build do I have to make?

My ESP is not bringing in the picture clearly. Remember that we *do* now have the ability to build generic defaults for pretty much all hardware by manually setting the architecture to one of the new generic architectures (x8664x87, x8664SSE1, etc.), and manually setting the ISA extension to use (SSE1, SSE2, SSE3).

If you want something that runs on every 32-bit machine ever produced, you pick x8632x87; this uses the x87 unit, and in the worst case (old Intel hardware) will reduce your peak to 1/4 for single and 1/2 for double. I don't think any 64-bit machine ever lacked SSE1.

The way I guess I would approach it is to run the ATLAS configure process on as many instances as you can, and see what the least capable architecture ever detected is. Or perhaps you could ask someone what the guaranteed ISA compatibility is?

Clint
Re: 3.9.72

Forcing value 256 instead of 0 fixed the issue. I think you have a bug here.
However, I still have the same kind of issue.

make[6]: Leaving directory '/home/sylvestre/dev/debian/debian-science/packages/atlas/branches/atlas-3.9/build-area/atlas3.9-3.9.72/build/atlas-base/tune/blas/gemm'
EXTERNAL SEARCH FAILED: res/dMMKSSE.sum
SUCCESSFUL FINISH FOR ./xummsearch
TESTING PRE='d' FILE='ATL_mm4x4x2US.c', NB=60 . . . PASSED!
0. NB= 60, rout= ATL_mm4x4x2US.c, MFLOP=1602.76
TESTING PRE='d' FILE='ATL_mm4x4x2_1_pref.c', NB=60 . . . PASSED!
1. NB= 60, rout= ATL_mm4x4x2_1_pref.c, MFLOP=1664.88
2. NB= 60, rout= ATL_mm4x4x2_1_prefCU.c, MFLOP=0.00
TESTING PRE='d' FILE='ATL_mm4x4x8_bpfabc.c', NB=56 . . . PASSED!
3. NB= 56, rout= ATL_mm4x4x8_bpfabc.c, MFLOP=1920.01
4. NB= 48, rout= ATL_mm4x3x8p.c, MFLOP=1607.99
5. NB= 60, rout= ATL_mm4x3x2p.c, MFLOP=1784.31
6. NB= 56, rout= ATL_mm4x4x8p.c, MFLOP=1885.88
7. NB= 56, rout= ATL_mm4x4x56_av.c, MFLOP=-1.00
8. NB= 56, rout= ATL_mm4x4x8_av.c, MFLOP=-1.00
9. NB= 60, rout= ATL_mm4x4x4_av.c, MFLOP=-1.00
10. NB= 60, rout= ATL_mm4x4x4_av.c, MFLOP=0.00
11. NB= 48, rout= ATL_mm6x8x8_1p.c, MFLOP=1505.21
12. NB= 58, rout= ATL_mm6x8x8_1p.c, MFLOP=0.00
TESTING PRE='d' FILE='ATL_dmm_julian_gas_30.c', NB=30 . . . PASSED!
13. NB= 30, rout= ATL_dmm_julian_gas_30.c, MFLOP=2277.62
TESTING PRE='d' FILE='ATL_dmm2x1x40_5pABC.c', NB=40 . . . PASSED!
14. NB= 40, rout= ATL_dmm2x1x40_5pABC.c, MFLOP=2312.66
15. NB= 60, rout= ATL_dmm6x1x30_x87.c, MFLOP=-1.00
17. NB= 56, rout= ATL_mm8x8x2.c, MFLOP=1406.43
18. NB= 48, rout= ATL_dmm4x4x16_hppa.c, MFLOP=-1.00
19. NB= 60, rout= ATL_dmm4x1x90_x87.c, MFLOP=2302.14
20. NB= 56, rout= ATL_dmm8x1x120_sse2.c, MFLOP=-1.00
21. NB= 48, rout= ATL_mm6x8x8_1p.c, MFLOP=1521.40
22. NB= 58, rout= ATL_mm6x8x8_1p.c, MFLOP=0.00
24. NB= 56, rout= ATL_mm8x8x2.c, MFLOP=1411.80
25. NB= 60, rout= ATL_dmm4x4x80_ppc.c, MFLOP=-1.00
BEST CASE IS ID=317, NB=40, MFLOP=2312.66
BEST BLOCKING FACTOR FOR CASE 317: 40
BEST USER CASE 317, NB=40: 2312.66 MFLOP
BEGIN BASIC MATMUL KERNEL TESTS:
Kernel CASES/ATL_dmm2x1x40_5pABC.c(317) passes basic tests
DONE BASIC KERNEL TESTS.
BEST OF USER CASES AND EXTERNAL SEARCHES:
ID=317 ROUT='CASES/ATL_dmm2x1x40_5pABC.c' AUTH='R. Clint Whaley' TA='T' TB='N' \
MULADD=0 PREF=0 LAT=4 NFTCH=0 IFTCH=0 FFTCH=0 KBMAX=0 KBMIN=0 KU=40 \
NU=1 MU=2 MB=40 NB=40 KB=40 L14NB=0 PFBCOLS=0 PFABLK=0 PFACOLS=0 STFLOAT=0 \
LDFLOAT=0 AOUTER=0 LDAB=1 BETAN1=0 LDISKB=1 KUISKB=0 KRUNTIME=0 NRUNTIME=1 \
MRUNTIME=1 LDCTOP=0 X87=0 \
MFLOP=2.312660e+03,2.323768e+03 CFLAGS='-x assembler-with-cpp' \
COMP='/usr/bin/gcc -Wa,--noexecstack -fPIC -m32'
make[5]: Leaving directory '/home/sylvestre/dev/debian/debian-science/packages/atlas/branches/atlas-3.9/build-area/atlas3.9-3.9.72/build/atlas-base/tune/blas/gemm'
xmmsearch: /home/sylvestre/dev/debian/debian-science/packages/atlas/branches/atlas-3.9/build-area/atlas3.9-3.9.72/build/atlas-base/../..//include/atlas_genparse.h:166: GetDoubleArr: Assertion `sscanf(str, "%le", d+i) == 1' failed.
INVOKING GMMSEARCH.C, PRE='d'
DONE GMMSEARCH.C, PRE='d'
RUNNING EXTERNAL SEARCHES, PRE='d', NB=58:
make[4]: *** [res/dMMRES.sum] Aborted
make[4]: Leaving directory '/home/sylvestre/dev/debian/debian-science/packages/atlas/branches/atlas-3.9/build-area/atlas3.9-3.9.72/build/atlas-base/tune/blas/gemm'
make[3]: *** [/home/sylvestre/dev/debian/debian-science/packages/atlas/branches/atlas-3.9/build-area/atlas3.9-3.9.72/build/atlas-base/tune/blas/gemm/res/dMMRES.sum] Error 2
make[3]: Leaving directory '/home/sylvestre/dev/debian/debian-science/packages/atlas/branches/atlas-3.9/build-area/atlas3.9-3.9.72/build/atlas-base/bin'
xatlas_build: /home/sylvestre/dev/debian/debian-science/packages/atlas/branches/atlas-3.9/build-area/atlas3.9-3.9.72/build/atlas-base/../..//bin/atlas_install.c:706: GoToTown: Assertion `mmp' failed.
/bin/sh: line 1: 23108 Aborted ./xatlas_build -1 0 -a 1 -l 1
make[2]: *** [build] Error 134

with
../../configure -D c -DWALL -b 32 -Fa alg '-Wa,--noexecstack -fPIC' -A 0 -V 256 -v 2

Cheers,
Sylvestre

On Monday, 02 April 2012 at 10:24 -0500, Clint Whaley wrote:
> Sylvestre,
>
> As far as I can tell, you must have put the wrong value on the -V configure
> argument, because this flag only gets used if ATLAS detects/IS TOLD that
> VSX is on using this flag. Your configure example showed you using a
> variable for this flag, but not the value. To see what value you want
> to use with -V, run "make xprint_enums ; ./xprint_enums" to see what
> the values of vector extensions are, and then add together all the
> values that you want to set (for instance SSE1 & SSE2 = 128+256 = 384).
>
> Probably you are using the value that used to work for .71, but .72
> added the vector extension AVXFMA4, which changed the values . . .
>
> Regards,
> Clint
> On 02/04/2012 16:23, Clint Whaley wrote:
> >> It is me again... I am trying with the 72 and it is failing:
> >> OUTPUT:
> >> =======
> >> cmnd=make IRunCComp CC='/usr/bin/gcc' CCFLAGS='-fomit-frame-pointer
> >> -mfpmath=sse -O2 -mvsx -Wa,--noexecstack -fPIC -m32' | fgrep SUCCESS
> >> /usr/bin/gcc -fomit-frame-pointer -mfpmath=sse -O2 -mvsx
> >> -Wa,--noexecstack -fPIC -m32 : FAILURE!
> > This is the problem. Notice the compiler flags show -mvsx *AND* -mfpmath=sse
> > which cannot both be true. -mvsx works only on newer POWER archs, and
> > fpmath works only on x86. What is the ARCH of this machine?
> >
> The same one you had access to: my x86.
> S