Octave vs. Matlab speed difference, any suggestions?

by szabi-11

Hi all!

I need to multiply and sum up 3-dimensional matrices. I noticed that my script runs way slower on octave. I traced the problem to the actual multiplication of the matrices. For example, take the following script:

--snip --
Q=rand(256,256,10);
W=rand(256,256,10);

tic
for ind=1:10;
        Q(:,:,ind).*W(:,:,ind);
end
toc
--snip--

On a machine that has both matlab (Version 7.5.0.338 (R2007b)) and octave (3.0.5 configured for "x86_64-redhat-linux-gnu"), the average timings from a few runs are

matlab: 0.0086 seconds
octave: 0.3350 seconds

Naturally my question is, why?

On a separate machine (on which I have administrative rights) I checked what difference it makes if I install an sse2 optimized ATLAS-package instead of the general one. The octave results improved by roughly 10%.

-Szabi

Octave vs. Matlab speed difference, any suggestions?

by John W. Eaton-3

On  6-Aug-2009, szabi-11 wrote:

| I need to multiply and sum up 3 dimensional matrices. I noticed that my
| script runs way slower on octave. I traced the problem to the actual
| multiplication of the matrices. For example take the following script:
|
| --snip --
| Q=rand(256,256,10);
| W=rand(256,256,10);
|
| tic
| for ind=1:10;
|         Q(:,:,ind).*W(:,:,ind);
| end
| toc
| --snip--
|
| On a machine having both matlab ( Version 7.5.0.338 (R2007b)) and octave
| (3.0.5 configured for "x86_64-redhat-linux-gnu") the average timings from a
| few runs are
|
| matlab: 0.0086 seconds
| octave: 0.3350 seconds
|
| Naturally my question is, why?

Most likely, you are measuring the difference in speed of array
indexing.  In 3.0.x, Octave created temporary variables for slices
like Q(:,:,ind).  In 3.2.x, array slices are handled more efficiently.
For example, Octave 3.0.5 on my system runs your example in about .95
seconds, and Octave 3.2.2 on the same system runs it in about .03
seconds.  So I'd recommend upgrading to 3.2.2 and running your test
again.
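
A minimal sketch related to the point above, assuming the end goal is the element-wise products (and possibly their sum over the third dimension) rather than the loop itself: the whole-array expression avoids indexing slices like Q(:,:,ind) in the interpreter altogether. The names P_loop, P_vec, S and max_diff are made up for illustration and do not come from the original script.

```octave
% Sketch: compute the element-wise products without per-slice indexing.
Q = rand(256, 256, 10);
W = rand(256, 256, 10);

% Loop version from the original post, with the results stored
% so that they can be compared against the whole-array version.
P_loop = zeros(size(Q));
for ind = 1:size(Q, 3)
  P_loop(:, :, ind) = Q(:, :, ind) .* W(:, :, ind);
end

% Whole-array version: a single element-wise product, no explicit slices.
P_vec = Q .* W;

% If the products are to be summed up over the third dimension:
S = sum(P_vec, 3);

% Both versions multiply the same pairs of elements.
max_diff = max(abs(P_loop(:) - P_vec(:)));
```

Whether this helps in a real script depends on what else happens inside the loop; it is only meant to show that the slicing can often be avoided.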

| On a separate machine (on which I have administrative rights) I checked what
| difference it makes if I install an sse2 optimized ATLAS-package instead of the
| general one. The octave results improved roughly 10%.

That's surprising, since I don't think any BLAS routines are used when
computing element-by-element array products.

jwe
_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Octave vs. Matlab speed difference, any suggestions?

by Jaroslav Hajek-2

On Thu, Aug 6, 2009 at 10:06 PM, John W. Eaton <jwe@...> wrote:

> On  6-Aug-2009, szabi-11 wrote:
>
> | I need to multiply and sum up 3 dimensional matrices. I noticed that my
> | script runs way slower on octave. I traced the problem to the actual
> | multiplication of the matrices. For example take the following script:
> |
> | --snip --
> | Q=rand(256,256,10);
> | W=rand(256,256,10);
> |
> | tic
> | for ind=1:10;
> |         Q(:,:,ind).*W(:,:,ind);
> | end
> | toc
> | --snip--
> |
> | On a machine having both matlab ( Version 7.5.0.338 (R2007b)) and octave
> | (3.0.5 configured for "x86_64-redhat-linux-gnu") the average timings from a
> | few runs are
> |
> | matlab: 0.0086 seconds
> | octave: 0.3350 seconds
> |
> | Naturally my question is, why?
>
> Most likely, you are measuring the difference in speed of array
> indexing.  In 3.0.x, Octave created temporary variables for slices
> like Q(:,:,ind).  In 3.2.x, array slices are handled more efficiently.
> For example, Octave 3.0.5 on my system runs your example in about .95
> seconds, and Octave 3.2.2 on the same system runs it in about .03
> seconds.  So I'd recommend upgrading to 3.2.2 and running your test
> again.
>

It may not be just the liboctave indexing part; the sizes are small
enough that the interpreter overhead may also play a role. Still,
I'd expect a major improvement from switching to 3.2.x.


> | On a separate machine (on which I have administrative rights) I checked what
> | difference it makes if I install an sse2 optimized ATLAS-package instead of the
> | general one. The octave results improved roughly 10%.
>
> That's surprising, since I don't think any BLAS routines are used when
> computing element-by-element array products.
>

Apparently, then, the error of your measurements is at least 0.03s,
because BLAS simply has nothing to do with this operation (provided
the operator really is .* and not *). tic/toc give realistic results
only if you run your measurements on an unloaded machine. You may also
use cputime. I would suggest increasing the size of the matrices and
the number of loop cycles.
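
A rough sketch of what such a measurement could look like, recording both wall-clock time (tic/toc) and CPU time (cputime). The size n, the repetition count nrep and the variable names are assumptions chosen for illustration, not values from the original test.

```octave
% Sketch: time the sliced product with larger matrices and more loop cycles.
n = 512;        % assumed matrix size (the original script used 256)
nslices = 10;   % number of slices, as in the original script
nrep = 20;      % assumed number of repetitions of the whole loop

Q = rand(n, n, nslices);
W = rand(n, n, nslices);

cpu_start = cputime;   % CPU time used by the process so far
tic;                   % start the wall-clock timer
for rep = 1:nrep
  for ind = 1:nslices
    P = Q(:, :, ind) .* W(:, :, ind);
  end
end
elapsed_wall = toc;                  % wall-clock seconds
elapsed_cpu = cputime - cpu_start;   % CPU seconds

fprintf('wall: %.4f s, cpu: %.4f s, per slice product: %.6f s\n', ...
        elapsed_wall, elapsed_cpu, elapsed_wall / (nrep * nslices));
```

If the two numbers differ a lot, the machine was probably busy with something else during the run.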

If you'd like to compare the speed of Octave vs. Matlab, the
"benchmark" package might be of interest:
http://octave.sourceforge.net/benchmark/index.html
benchmark_index measures the speed of array indexing. The package is
written with care to allow running on Matlab unchanged (it's just
m-files).

Some 5 months ago Andreas Romeyke compared the BE version of Octave
with Matlab 7.6 on his machine; the result was that Octave mostly won
the benchmark.

I would, of course, love to see more benchmarks (for instance with
more recent versions), but you should remember:
1. For us developers, benchmarking the bleeding edge version is the
most valuable. Older versions may lack some improvements, and
benchmarking anything as ancient as 3.0.x is simply pointless for us
(though of course it may still be worthwhile for you, if you're stuck
with it).
2. The benchmark package uses tic/toc rather than cputime, which may
have a fairly coarse resolution (for instance, on my machine, the
resolution seems to be 0.004s at best, and it may be far worse). If you
have a single-core machine, you must stop as many background processes
as possible, otherwise you may get very misleading results.
On a dual-core machine, usually one core gets the Octave process (which
keeps up to 100% CPU usage during the run of the benchmark) and the
other core handles other processes, so the result is not that bad;
still, I would avoid browsing flash pages or the like.
3. Do not expect miracles from Octave, which is not developed by any
million dollar company. If you want it to be improved, contribute or
donate.
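
Regarding the cputime resolution mentioned in point 2, here is a small sketch of one way to observe it on a given machine: busy-wait until the value returned by cputime changes, then measure the size of the next change. The variable names are made up, and the observed step also includes the overhead of the loop itself, so it should be read as a rough estimate only.

```octave
% Sketch: estimate the smallest observable step of cputime by busy-waiting.
t0 = cputime;
t1 = cputime;
while t1 == t0       % spin until cputime first changes
  t1 = cputime;
end

t2 = t1;
while t2 == t1       % spin until it changes again
  t2 = cputime;
end

step = t2 - t1;      % observed step between two consecutive values
fprintf('observed cputime step: %g s\n', step);
```

Timings shorter than a few such steps cannot be measured reliably with cputime.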

cheers

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
