LAPACK speed in Windows distribution

by Alexander Mamonov-2

I have recently built Octave from recent sources with MinGW, and the
first thing I tried was
> tic; lu(rand(1000,1000)); toc;
For my build of Octave I compiled a plain vanilla unoptimized LAPACK
from netlib (LAPACK-lite), and the result of 0.5 sec versus M*lab's
0.24 sec was not surprising to me. Then I compared it with Octave
3.0.2 (MinGW) and 3.0.3 (VC) from octave-forge. Both were installed
with ATLAS chosen by the installer for my machine (SSE3 I believe). To
my astonishment both 3.0.2 and 3.0.3 have consistently shown a result
of 1.4 sec. That's almost three times slower than the unoptimized (!)
LAPACK, and roughly six times slower than the result of the commercial
competitor.
I want Windows maintainers to be aware of this issue, so that some
improvements can be made in future versions.
Regards,

Alex
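
A single tic/toc measurement like the one-liner above can be noisy, and it also includes the time spent generating the random matrix. A minimal sketch of a more repeatable version follows, assuming a reasonably recent Octave where `toc` accepts the identifier returned by `tic`. The repetition count and variable names are illustrative choices, not something from the original post.

```octave
% Sketch: time LU factorization several times and report the best run.
n    = 1000;          % matrix size used in the original post
reps = 5;             % number of repetitions (an assumption)
A    = rand(n, n);    % build the matrix outside the timed region

times = zeros(1, reps);
for k = 1:reps
  t0 = tic;               % start a timer for this run
  [L, U, P] = lu(A);      % the operation being benchmarked
  times(k) = toc(t0);     % elapsed seconds for this run
end

best = min(times);        % least disturbed by other system activity
med  = median(times);
fprintf('lu(%dx%d): best %.3f s, median %.3f s\n', n, n, best, med);
```

Reporting the minimum alongside the median makes it easier to compare builds, since the minimum is the run least affected by background load.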

Re: LAPACK speed in Windows distribution

by Tatsuro MATSUOKA-2

Hello

The speed of ATLAS depends strongly on the CPU architecture and/or the code generated by the
compiler. I think this is a very difficult problem to solve for generally distributed binaries.

If you need higher matrix performance, you should try building ATLAS or GotoBLAS on your own computer
and then building Octave on Cygwin or MinGW.

On my computer (Pentium with HT, Prescott 3.4GHz), I obtained four times higher matrix calculation
performance than with Octave 3.0.2 (MinGW).

If you want to build Octave from source, I have prepared a library kit for the MinGW build.

http://www.tatsuromatsuoka.com/octave/Eng/Win/index.html

0005 OctaveBuild.zip, 8,380,551 bytes, 2009-04-28, md5 3587b65873be7d5e2b38a671162fa61e: Octave build tool kit for MinGW
0006 ReadmeBriefOctBuildMingw.html, 11,748 bytes, 2009-05-04, md5 1a52737ad283dfd8178159edc1720dc3: brief explanation of the Octave build tool kit for MinGW; please read this before use.

Regards

Tatsuro




Re: LAPACK speed in Windows distribution

by Alexander Mamonov-2

Hello Tatsuro,

I understand that performance of ATLAS depends greatly on the system
which was used to compile it. Since the Windows binary distributions
already ship with several different LAPACK libraries built for
different architectures, maybe it would make sense to add a little
code to the installer that would test the performance of these
libraries (at the time of installation) and choose the one that gives
the best performance for a given machine?

Thank you for providing the build kit for MinGW users. May I suggest
that you expand the kit by including in it all dependencies required
to build Octave. This will provide a simple drop-in solution for
Windows users: install MinGW/MSYS -> add the kit -> build Octave. If
you think it's worth doing, I can provide you with a set of binaries
that I have built with octave-forge patches and scripts
(https://octave.svn.sourceforge.net/svnroot/octave/trunk/octave-forge/admin/Windows/mingw32/).

Regards,
Alex
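
The install-time selection suggested above comes down to running the same benchmark against each bundled library and keeping the fastest. A minimal sketch of that final step follows. The candidate names and timings are placeholders made up for illustration, not measurements, and how the installer would actually swap libraries between runs is left open.

```octave
% Sketch: pick the library variant with the lowest benchmark time.
% Names and timings are hypothetical placeholders.
candidates = {'generic', 'sse2', 'sse3'};
elapsed    = [3.0, 1.5, 0.9];    % seconds per benchmark run (made up)

[best_time, idx] = min(elapsed); % index of the fastest variant
chosen = candidates{idx};

fprintf('selected %s (%.2f s)\n', chosen, best_time);
```

In practice each entry of `elapsed` would come from running an identical timing script once per candidate library.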


Re: LAPACK speed in Windows distribution

by Tatsuro MATSUOKA-2

Hello Alex

Thank you for your comment and suggestion.

I will consider your suggestions.
Please give me some time to consider what will be best given my limited time.

Anyway, I appreciate your comments and am glad that you have come to the Octave mailing list as
someone who can handle the MinGW/MSYS system for building Octave.

Regards

Tatsuro


Re: LAPACK speed in Windows distribution

by maiky76

Hi,

Not sure if it can help, but here are the results of a small test I found here on the forum:

a=randn(2000);b=randn(2000);tic; c=a*b; toc,
Intel C2D T8100 (2x2.1GHz) + 2 GB RAM
Matlab 7.7.0.471 (R2008b) = 1.4s
Matlab 7.0.0.19920 (R14) = 7.7s
Octave 3.0.2 with SSE3 (the installer available on the website) = 2.4s
Octave 3.0.3 (Tatsuro's build) with SSE2, no SSE3 available = 7.5s

BTW, I tried the test on an ASUS Eee PC (Intel Atom 1.6GHz + 1 GB RAM netbook): the result is 45s with SSE2, the same with and without Hyper-Threading.
Cheers
Mickael
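
To compare the timings above across machines, it can help to convert them into an approximate floating-point rate. The sketch below assumes the usual estimate of about 2*n^3 operations for a dense n-by-n matrix product. The labels are shortened versions of the entries above, and the variable names are illustrative.

```octave
% Sketch: convert the reported matrix-multiply timings into GFLOP/s.
n     = 2000;
flops = 2 * n^3;   % approximate operation count for c = a*b

labels = {'Matlab R2008b', 'Matlab R14', 'Octave 3.0.2 SSE3', 'Octave 3.0.3 SSE2'};
secs   = [1.4, 7.7, 2.4, 7.5];      % seconds, as reported above

gflops = flops ./ secs / 1e9;       % rate in units of 1e9 operations per second

for k = 1:numel(secs)
  fprintf('%-20s %5.1f s  %6.2f GFLOP/s\n', labels{k}, secs(k), gflops(k));
end
```

These rates are only as reliable as the single-run timings they are derived from.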


Re: LAPACK speed in Windows distribution

by maiky76


Hi,

Forgot to say that with the generic install (no SSE at all):
Octave 3.0.3, no SSE = 17.5s
a=randn(2000);b=randn(2000);tic; c=a*b; toc,
Intel C2D T8100 (2x2.1GHz) + 2Gb ram
Matlab 7.7.0.471 (R2008b) = 1.4s
Matlab 7.0.0.19920 (R14) = 7.7s
Octave 3.0.2 with SSE3 (the installer available on the website) = 2.4s
Octave 3.0.3 (Tatsuro's build) with SSE2, no SSE3 available = 7.5s

Mickael Lefebvre, MEng, MSc.
Acoustic team leader
--
View this message in context: http://www.nabble.com/LAPACK-speed-in-Windows-distribution-tp23493499p23514652.html
Sent from the Octave - Maintainers mailing list archive at Nabble.com.


Re: LAPACK speed in Windows distribution

by Tatsuro MATSUOKA-2

Hello

--- maiky76  wrote:

> Octave 3.0.3 (Tatsuro's build) with SSE2, no SSE3 available = 7.5s
This binary is distributed by Michael, not by me.

Regards

Tatsuro
