3.9.15

Guys,
I've been quiet recently; I have been overwhelmed at work. However, I have been working every spare moment on ATLAS. The task I've been working on is changing the ATLAS matmul search to incorporate a new code generator! Chad Zalkin (a student at UTSA working with me on his MS) created an SSE-enabled code generator for ATLAS this summer, and I have just finished getting ATLAS's framework to utilize it.

The code generator uses gcc/icc intrinsics to vectorize the main GEMM kernel. The SSE generator's main purpose is to ease ATLAS's reliance on hand-tuning for vectorized kernels. On some machines, it provides speedup over existing hand-tuned kernels (e.g., my Core2 system gets about 8% speedup for single precision). I haven't tracked it down yet, but the code generator never seems to provide speedup on the AMD systems I have access to, though it does seem to help Intel systems. I'm guessing gcc is generating an instruction stream that Intel likes but that is not OK on AMDs, but it'll have to be looked into . . .

Chad is still working on the code generator: right now it does not work for single precision complex; I tend not to work very hard on hand-tuning single precision, so performance should probably go up even further when this is fixed.

I have not provided architectural defaults for the new search, so the install can be quite long in 3.9.15. However, I thought people would be interested to see the new code generator; if you want a faster install, just continue using 3.9.14 for now.

I have also started the process of rationalizing ATLAS's search. ATLAS is now built so that others can easily plug their own searches and/or code generators into the ATLAS framework. I still need to produce some documentation explaining how to do this, but you can find most things you need in ATLAS/include/atlas_mm[parse,testtime].h.

The other nice thing that I think people will like is that I have quietened down the GEMM search.
All the compilation and so forth goes to /dev/null, so that it is easier to see the timing results as they are searched . . .

Cheers,
Clint

ATLAS 3.9.15 released 10/10/09, changes from 3.9.14
 * Addition of Chad Zalkin's SSE GEMM generator to ATLAS
 * Support for external searches and use of standard matmul search routs in:
   - include/atlas_mmparse.h
   - include/atlas_mmtesttime.h
 * Numerous search changes to incorporate above in ATLAS matmul install
   - Changed matmul install to be much quieter

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************

_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@...
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel
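
For readers unfamiliar with what "uses gcc/icc intrinsics to vectorize the main GEMM kernel" looks like in practice, here is a minimal sketch in C. It is not output from Chad's generator and not ATLAS code: the names sse_dot and sse_gemm_tn, the dot-product formulation, the column-major layout, and the requirement that K be a multiple of 4 are all assumptions made for illustration. Only the intrinsics themselves (from xmmintrin.h) are real.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics, available in both gcc and icc */

/* Hypothetical helper (not an ATLAS routine): single-precision dot product
 * of two length-k vectors, processing four floats per SSE register.
 * Assumes k is a multiple of 4; unaligned loads keep the sketch simple. */
float sse_dot(const float *a, const float *b, size_t k)
{
    __m128 acc = _mm_setzero_ps();   /* four running partial sums */
    float lanes[4];
    size_t i;

    assert(k % 4 == 0);
    for (i = 0; i < k; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Horizontal reduction of the four partial sums. */
    _mm_storeu_ps(lanes, acc);
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
}

/* Hypothetical kernel: C += A^T * B with column-major storage, where
 * A is k x m, B is k x n and C is m x n, so that every entry of C is a
 * dot product of two contiguous columns. */
void sse_gemm_tn(size_t m, size_t n, size_t k,
                 const float *A, const float *B, float *C)
{
    size_t i, j;

    for (j = 0; j < n; j++)
        for (i = 0; i < m; i++)
            C[i + j * m] += sse_dot(A + i * k, B + j * k, k);
}
```

A real generated kernel would presumably also unroll the loops and keep several accumulators live in registers, and those choices are what a search has to tune per machine; the sketch only shows the basic load, multiply, and accumulate pattern that intrinsics expose.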