|
View:
New views
4 Messages
—
Rating Filter:
Alert me
|
|
|
Improved triu.m and tril.mDear maintainers,
although triu and tril are built-in functions in Matlab and then much faster, I discovered that it is possible to improve the m files in Octave, by eliminating the min/max evaluations inside the loops and selecting whether to replace numbers with zeros or zeros wih numbers. The results, for my Octave 3.2.3, are: > A=rand(100); > i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) i = -31 Elapsed time is 0.0221 seconds. Elapsed time is 0.00397 seconds. ans = 0 > i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) i = -52 Elapsed time is 0.0224 seconds. Elapsed time is 0.00114 seconds. ans = 0 > i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) i = 22 Elapsed time is 0.0174 seconds. Elapsed time is 0.00495 seconds. ans = 0 > i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) i = 63 Elapsed time is 0.00873 seconds. Elapsed time is 0.00261 seconds. ans = 0 > i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) i = 36 Elapsed time is 0.0146 seconds. Elapsed time is 0.00413 seconds. ans = 0 For me the speed-up is > 3. Enclosed, the patches wrt 3.2.3. Best regards, Marco [triu.m.diff] --- triu.m.orig 2009-10-22 17:36:28.000000000 +0200 +++ triu.m 2009-10-22 17:49:10.000000000 +0200 @@ -28,25 +28,70 @@ if (nargin > 0) if (isstruct (x)) - error ("tril: structure arrays not supported"); + error ("triu: structure arrays not supported"); endif [nr, nc] = size (x); endif if (nargin == 1) k = 0; elseif (nargin == 2) - if ((k > 0 && k > nc) || (k < 0 && k < -nr)) + if ((k > 0 && k > nc) || (k < 0 && k < -nr+1)) error ("triu: requested diagonal out of range"); endif else print_usage (); endif - - retval = resize (resize (x, 0), nr, nc); - for j = max (1, k+1) : nc - nr_limit = min (nr, j-k); - retval (1:nr_limit, j) = x (1:nr_limit, j); - endfor + if (nr == nc) + if (k <= 0) + retval = x; + for j = 1 : nr-1+k + retval((j-1)*nr+j+1-k:j*nr) = 0; + endfor + else +% k > 0 + retval = zeros(size(x)); + for j = nc : -1 : k+1 + idx = (j-1)*nr+1:(j-1)*nr-k+j; + retval(idx) = x(idx); + endfor + endif + elseif (nr > nc) + if (k <= floor((1+nc-nr)/2)) + retval = x; + for j = 1 : min(nc,nr-1+k) + retval((j-1)*nr+j+1-k:j*nr) = 0; + endfor + else +% k > floor((1+nc-nr)/2) + retval = zeros(size(x)); + for j = nc : -1 : max(1,k+1) + idx = (j-1)*nr+1:(j-1)*nr-k+j; + retval(idx) = x(idx); + endfor + endif + else +% nc > nr + if (k <= floor((1+nc-nr)/2)) + retval = x; + for j = 1 : k-1 + retval((j-1)*nr+1:j*nr) = 0; + endfor + for j = max(1,k) : nr-1+k + retval((j-1)*nr+j+1-k:j*nr) = 0; + endfor + else +% k > floor((1+nc-nr)/2) + retval = zeros(size(x)); + for j = nc : -1 : k+nr+1 + idx = (j-1)*nr+1:j*nr; + retval(idx) = x(idx); + end + for j = min(nc,k+nr) : -1 : k+1 + idx = (j-1)*nr+1:(j-1)*nr-k+j; + retval(idx) = x(idx); + endfor + endif + endif endfunction [tril.m.diff] --- tril.m.orig 2009-10-22 17:36:30.000000000 +0200 +++ tril.m 2009-10-22 17:49:14.000000000 +0200 @@ -72,22 +72,66 @@ endif [nr, nc] = size (x); endif - if (nargin == 1) k = 0; elseif (nargin == 2) - if ((k > 0 && k > nc) || (k < 0 && k < -nr)) + if ((k > 0 && k > nc-1) || (k < 0 && k < -nr)) error ("tril: requested diagonal out of range"); endif else print_usage (); endif - - retval = resize (resize (x, 0), nr, nc); - for j = 1 : min (nc, nr+k) - nr_limit = max (1, j-k); - retval (nr_limit:nr, j) = x (nr_limit:nr, j); - endfor + if (nr == nc) + if (k < 0) + retval = zeros(size(x)); + for j = 1 : nr+k + idx = (j-1)*nr+j-k:j*nr; + retval(idx) = x(idx); + endfor + else +% k >= 0 + retval = x; + for j = nc : -1 : k+2 + retval((j-1)*nr+1:(j-1)*nr-k-1+j) = 0; + endfor + endif + elseif (nr > nc) + if (k < floor((1+nc-nr)/2)) + retval = zeros(size(x)); + for j = 1 : min(nc,nr+k) + idx = (j-1)*nr+j-k:j*nr; + retval(idx) = x(idx); + endfor + else +% k >= floor((1+nc-nr)/2) + retval = x; + for j = nc : -1 : max(1,k+2) + retval((j-1)*nr+1:(j-1)*nr-k-1+j) = 0; + endfor + endif + else +% nc > nr + if (k < floor((1+nc-nr)/2)) + retval = zeros(size(x)); + for j = 1 : k + idx = (j-1)*nr+1:j*nr; + retval(idx) = x(idx); + endfor + for j = max(1,k+1) : nr+k + idx = (j-1)*nr+j-k:j*nr; + retval(idx) = x(idx); + endfor + else +% k >= floor((1+nc-nr)/2) + retval = x; + for j = nc : -1 : k+nr+2 + retval((j-1)*nr+1:j*nr) = 0; + end + for j = min(nc,k+1+nr) : -1 : k+2 + retval((j-1)*nr+1:(j-1)*nr-k-1+j) = 0; + endfor + endif + endif endfunction @@ -109,4 +153,3 @@ %!error tril (); %!error tril (1, 2, 3); - |
|
|
Re: Improved triu.m and tril.m2009/10/22 Marco Caliari <marco.caliari@...>:
> Dear maintainers, > > although triu and tril are built-in functions in Matlab and then much > faster, I discovered that it is possible to improve the m files in Octave, > by eliminating the min/max evaluations inside the loops and selecting > whether to replace numbers with zeros or zeros wih numbers. > The results, for my Octave 3.2.3, are: > >> A=rand(100); >> >> i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) > > i = -31 > Elapsed time is 0.0221 seconds. > Elapsed time is 0.00397 seconds. > ans = 0 >> >> >> i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) > > i = -52 > Elapsed time is 0.0224 seconds. > Elapsed time is 0.00114 seconds. > ans = 0 >> >> >> i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) > > i = 22 > Elapsed time is 0.0174 seconds. > Elapsed time is 0.00495 seconds. > ans = 0 >> >> >> i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) > > i = 63 > Elapsed time is 0.00873 seconds. > Elapsed time is 0.00261 seconds. > ans = 0 >> >> >> i=randperm(200)(1)-100,tic,U=triu(A,i);,toc,tic,myU=mytriu(A,i);,toc,norm(U-myU) > > i = 36 > Elapsed time is 0.0146 seconds. > Elapsed time is 0.00413 seconds. > ans = 0 > > For me the speed-up is > 3. Enclosed, the patches wrt 3.2.3. > > Best regards, > > Marco Interesting. For comparison, here is my benchmark: n = 500; A = rand (n); k = 1; tril (eye (2)); # to avoid startup penalty for iter = 1:4 tic; tril (A, k); toc endfor my tip gives: Elapsed time is 0.0236731 seconds. Elapsed time is 0.0239391 seconds. Elapsed time is 0.0221 seconds. Elapsed time is 0.022387 seconds. the relevant code snippet is retval = resize (resize (x, 0), nr, nc); for j = 1 : min (nc, nr+k) nr_limit = max (1, j-k); retval (nr_limit:nr, j) = x (nr_limit:nr, j); endfor one can save a lot by doing a vectorized max once and eliminating the repeated range: retval = zeros (nr, nc, class (x)); jj = 1:min (nc, nr+k); ii = max (1, jj - k); for j = jj i = ii(j):nr; retval (i, j) = x (i, j); endfor this gives: Elapsed time is 0.0148931 seconds. Elapsed time is 0.0148301 seconds. Elapsed time is 0.0152121 seconds. Elapsed time is 0.014981 seconds. using cellslices + vertcat + reshape is faster still: m = min (nc, nr+k); ii = max (1, (1:m) - k); sl(2,:) = cellslices (x(:), ii, nr*ones (1, m)); sl(1,:) = cellslices (zeros (nr, 1), ones (1, m), ii - 1); retval = reshape (vertcat (sl{:}), nr, nc); gives Elapsed time is 0.00466204 seconds. Elapsed time is 0.00521612 seconds. Elapsed time is 0.00525379 seconds. Elapsed time is 0.00514317 seconds. for comparison, your patch: Elapsed time is 0.00818515 seconds. Elapsed time is 0.00864601 seconds. Elapsed time is 0.00886297 seconds. Elapsed time is 0.00826406 seconds. (quite good, compared to the loop-free code). Winner is number 3. triu can be handled similarly. Neither of these approaches works efficiently for sparse functions, though, so it should get a special code (probably using find). But anyway, I think implementing triu/tril/vech as compiled functions should be trivial and likely the fastest. Out of interest, do you depend on triu/tril efficiency in an application? regards -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz |
|
|
Re: Improved triu.m and tril.m> using cellslices + vertcat + reshape is faster still:
> > m = min (nc, nr+k); > ii = max (1, (1:m) - k); > > sl(2,:) = cellslices (x(:), ii, nr*ones (1, m)); > sl(1,:) = cellslices (zeros (nr, 1), ones (1, m), ii - 1); > > retval = reshape (vertcat (sl{:}), nr, nc); Great, even if, it does not work for me... I think the problem is related to what I get with 3.2.3 octave:1> cellslices([1,2],1,0) error: subscript indices must be either positive integers or logicals. Maybe the result should be { [1,1] = [](1x0) } Also this appears strange to me octave:1> cellslices([1,2,3],[2,3],[1,1]) error: invalid range used as index. > Out of interest, do you depend on triu/tril efficiency in an application? Not really. I had a code with fft and matrix manipulations and it was slower in Octave than in Matlab just because of a final triu. Cheers, Marco |
|
|
Re: Improved triu.m and tril.mOn Fri, Oct 23, 2009 at 9:52 AM, Marco Caliari <marco.caliari@...> wrote:
>> using cellslices + vertcat + reshape is faster still: >> >> m = min (nc, nr+k); >> ii = max (1, (1:m) - k); >> >> sl(2,:) = cellslices (x(:), ii, nr*ones (1, m)); >> sl(1,:) = cellslices (zeros (nr, 1), ones (1, m), ii - 1); >> >> retval = reshape (vertcat (sl{:}), nr, nc); > > Great, even if, it does not work for me... I think the problem is related to > what I get with 3.2.3 > > octave:1> cellslices([1,2],1,0) > error: subscript indices must be either positive integers or logicals. > Yes, this was already fixed in development tip. > Maybe the result should be > > { > [1,1] = [](1x0) > } > > Also this appears strange to me > > octave:1> cellslices([1,2,3],[2,3],[1,1]) > error: invalid range used as index. > Huh. One more bug. A fix is here: http://hg.savannah.gnu.org/hgweb/octave/rev/95ad9c2a27e2 >> Out of interest, do you depend on triu/tril efficiency in an application? > > Not really. I had a code with fft and matrix manipulations and it was slower > in Octave than in Matlab just because of a final triu. > > Cheers, > > Marco > Based on a code sent today by David Bateman, I implemented a compiled (and extended) version of triu/tril: http://hg.savannah.gnu.org/hgweb/octave/rev/b134960cea23 so you may check that out. Btw, If you want to help improve Octave's performance by measuring speed and benchmarks, I'd strongly recommend you work with the development sources. regards -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz |
| Free embeddable forum powered by Nabble | Forum Help |