Efficient code for operating on pairwise vector diffs

View: New views
8 Messages — Rating Filter:   Alert me  

Efficient code for operating on pairwise vector diffs

by Joseph Wakeling :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hello all,

Suppose that I have a set of N d-dimensional vectors, which I guess
could be represented as either a matrix,

   X = [X1 X2 ... XN]

or a cell:

   X = {X1 X2 ... XN}

I'm trying to implement a function which consists of the following steps:

  (i) calculate the pairwise vector differences Dij = Xi - Xj

  (ii) perform some function on each of the vector differences,
       Fij = F(Dij)

  (iii) Construct the averages aveFi = mean of Fij calculated over
        the subscript j.

My guess is that the best way to do this is to represent the vector
differences in a NxN cell whose (i,j)'th element is the d-dimensional
vector X(:,i)-X(:,j).  Then I can call cellfun to apply the function F
and finally the averages can be achieved by some use of mean(cell2mat())
or whatever.

Now, the challenge is how to generate the vector differences
efficiently.  I can do,

for i=1:N
    for j=1:N
        D{i,j} = X(:,i) - X(:,j)
    end
end

... but that seems very inefficient, which is exactly what I'm trying to
avoid(*).  Can anyone advise an efficient way to carry out the above?

Thanks & best wishes,

    -- Joe

(*) Motivation for this whole query: I'm editing some MATLAB code by a
colleague which we're going to release some time soon(**).  His code
works but is not very efficiently written from a MATLAB/Octave point of
view, and so he's used C extensions for some parts.

I think that the code can be made efficient without needing to use C,
which would be very nice as it would allow it to be run in Octave as
well as MATLAB.  But my implementation is foundering on implementing
the above code without costly for loops ...

(**) In answer to 'When?' and 'Why can't we see the actual code now?':
it comes down to academic confidentiality.  If it was just my work I
would happily share, but I can't presume on the part of my co-workers.
>From a publication point of view, we've done all the necessary work and
could just chuck the existing code over the wall; but if I can make sure
that the supporting information to our paper contains Octave-compatible
code, I would like to.  Hope you understand.
_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Efficient code for operating on pairwise vector diffs

by Jaroslav Hajek-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, Nov 2, 2009 at 2:21 PM, Joseph Wakeling
<joseph.wakeling@...> wrote:

> Hello all,
>
> Suppose that I have a set of N d-dimensional vectors, which I guess
> could be represented as either a matrix,
>
>   X = [X1 X2 ... XN]
>
> or a cell:
>
>   X = {X1 X2 ... XN}
>
> I'm trying to implement a function which consists of the following steps:
>
>  (i) calculate the pairwise vector differences Dij = Xi - Xj
>
>  (ii) perform some function on each of the vector differences,
>       Fij = F(Dij)
>
>  (iii) Construct the averages aveFi = mean of Fij calculated over
>        the subscript j.
>
> My guess is that the best way to do this is to represent the vector
> differences in a NxN cell whose (i,j)'th element is the d-dimensional
> vector X(:,i)-X(:,j).  Then I can call cellfun to apply the function F
> and finally the averages can be achieved by some use of mean(cell2mat())
> or whatever.
>
> Now, the challenge is how to generate the vector differences
> efficiently.  I can do,
>
> for i=1:N
>    for j=1:N
>        D{i,j} = X(:,i) - X(:,j)
>    end
> end
>
> ... but that seems very inefficient, which is exactly what I'm trying to
> avoid(*).  Can anyone advise an efficient way to carry out the above?
>
> Thanks & best wishes,
>
>    -- Joe


If X is a DxN matrix, you can get a dxNxN array of pairwise
differences using, for instance using spread indexing

D = X(:,:,ones (1, N)) - reshape (X, d, 1, N)(:,ones (1, N), :);

or using bsxfun

D = bsxfun (@minus, X, reshape (X, d, 1, N));

the relative efficiency will depend on your version of Octave (bsxfun
wins if you use the bleeding edge sources). Then,

D = squeeze (nu2mcell (D, 1));

converts this to an NxN cell array of dx1 vectors. Note that unless
your mapper function is fast enough (built-in?), this isn't going to
win you much, because your function will still be called NxN times.

If F is a fixed mapping and relatively simple, there may be a better
way to do the trick in a few calls. See also
http://octave.sourceforge.net/doc/f/pdist.html

> (*) Motivation for this whole query: I'm editing some MATLAB code by a
> colleague which we're going to release some time soon(**).  His code
> works but is not very efficiently written from a MATLAB/Octave point of
> view, and so he's used C extensions for some parts.
>
> I think that the code can be made efficient without needing to use C,
> which would be very nice as it would allow it to be run in Octave as
> well as MATLAB.  But my implementation is foundering on implementing
> the above code without costly for loops ...
>

See the above comment. Note that Octave also has a C++ extension as an option.

> (**) In answer to 'When?' and 'Why can't we see the actual code now?':
> it comes down to academic confidentiality.  If it was just my work I
> would happily share, but I can't presume on the part of my co-workers.

Code is always better if you want good answers. You can only extract a
small (but working!) skeleton. Unless your co-workers are idiots, they
won't bark at you for showing a few lines, especially if you're trying
to improve your common work.

hth

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Efficient code for operating on pairwise vector diffs

by Joseph Wakeling :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Jaroslav Hajek wrote:

> If X is a DxN matrix, you can get a dxNxN array of pairwise
> differences using, for instance using spread indexing
>
> D = X(:,:,ones (1, N)) - reshape (X, d, 1, N)(:,ones (1, N), :);
>
> or using bsxfun
>
> D = bsxfun (@minus, X, reshape (X, d, 1, N));
>
> the relative efficiency will depend on your version of Octave (bsxfun
> wins if you use the bleeding edge sources).

That's brilliant -- thank you very much!  Seems to provide exactly what
I need.  I will play and see where this goes.

> converts this to an NxN cell array of dx1 vectors. Note that unless
> your mapper function is fast enough (built-in?), this isn't going to
> win you much, because your function will still be called NxN times.

I'm hoping it will win a fair amount relative to the code I inherited --
NxN is an inevitability, but I think the actual implementation was
surely less efficient than that ...

> If F is a fixed mapping and relatively simple, there may be a better
> way to do the trick in a few calls. See also
> http://octave.sourceforge.net/doc/f/pdist.html

The code has to be both MATLAB- and Octave-compatible, so this is
probably out unless we copy from octave-forge.  Probably simpler to
stick with functions that are built-in with both programs.

> See the above comment. Note that Octave also has a C++ extension as an option.

Sure -- but it's preferable to remove techy problems like compiling
extensions from users' lives. :-)

> Code is always better if you want good answers. You can only extract a
> small (but working!) skeleton. Unless your co-workers are idiots, they
> won't bark at you for showing a few lines, especially if you're trying
> to improve your common work.

_I_ know that, but some people (and some fields:-) are paranoid about
such things.  I'll have a word with people and see what we can do.  I'm
trying to persuade them to accept a preprint release of the article; if
that goes ahead maybe we could at the same time present the code for
review to the Octave user community, and we could add a little note of
acknowledgement in the final published paper.  I would certainly like to
see the code end up in octave-forge at some point.

Thanks again for your detailed and helpful remarks,

    -- Joe
_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Efficient code for operating on pairwise vector diffs

by Joseph Wakeling :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Jaroslav Hajek wrote:
> Code is always better if you want good answers. You can only extract a
> small (but working!) skeleton. Unless your co-workers are idiots, they
> won't bark at you for showing a few lines, especially if you're trying
> to improve your common work.

Further to previous comments -- I wound up implementing pairwisediff()
as a function in its own right (attached).  This actually uses both your
suggested solutions: the version of MATLAB I was using to test MATLAB
compatibility didn't have bsxfun included, so I put in place this check
to use either bsxfun or the alternative.

Looking at this it occurs to me that it would be very easy to modify it
to allow pairwise diffs of either rows or columns (just add a DIM
parameter), and that it would make a nice addition to octave-forge.
This particular code has bugger-all to do with any academic worries, and
could be useful for a lot of people, so would be interested on having
comments on what needs to be added to it to make it a worthwhile submission.

Thanks & best wishes,

    -- Joe

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%                                                                             %
% PAIRWISE VECTOR DIFFERENCE                                                  %
%  (c) 2009 Joseph Rushton Wakeling                                           %
%                                                                             %
%  Special thanks to Jaroslav Hajek for suggesting these implementations      %
%  for efficient pairwise vector diff calculation.                            %
%                                                                             %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% This function calculates the pairwise differences between the columns of a
% given matrix X, returning a 3D matrix D such that
%
%     D(:,i,j) = X(:,i) - X(:,j)
%
function D = pairwisediff(X)

        if(exist('bsxfun')~=0)
                D = bsxfun( @minus, X, reshape(X,size(X,1),1,size(X,2)) );
        else
                XX = reshape(X,size(X,1),1,size(X,2));
                D = X(:, :, ones(1,size(X,2)) ) - XX(:, ones(1,size(X,2)), :);
        end

end

_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Efficient code for operating on pairwise vector diffs

by Jaroslav Hajek-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Nov 4, 2009 at 9:01 PM, Joseph Wakeling
<joseph.wakeling@...> wrote:

> Jaroslav Hajek wrote:
>> Code is always better if you want good answers. You can only extract a
>> small (but working!) skeleton. Unless your co-workers are idiots, they
>> won't bark at you for showing a few lines, especially if you're trying
>> to improve your common work.
>
> Further to previous comments -- I wound up implementing pairwisediff()
> as a function in its own right (attached).  This actually uses both your
> suggested solutions: the version of MATLAB I was using to test MATLAB
> compatibility didn't have bsxfun included, so I put in place this check
> to use either bsxfun or the alternative.
>
> Looking at this it occurs to me that it would be very easy to modify it
> to allow pairwise diffs of either rows or columns (just add a DIM
> parameter), and that it would make a nice addition to octave-forge.
> This particular code has bugger-all to do with any academic worries, and
> could be useful for a lot of people, so would be interested on having
> comments on what needs to be added to it to make it a worthwhile submission.
>

Hi,
sorry for the late reply. For an OctaveForge file, I think you should
write an Octave-like inline docstring. Error checks (number&type of
arguments) would also be nice. The dim argument also sounds good.
If you target just Octave, the exist("bsxfun") check is pretty unnecessary.

hth

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Efficient code for operating on pairwise vector diffs

by Joseph Wakeling :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Jaroslav Hajek wrote:
> Hi,
> sorry for the late reply. For an OctaveForge file, I think you should
> write an Octave-like inline docstring. Error checks (number&type of
> arguments) would also be nice. The dim argument also sounds good.
> If you target just Octave, the exist("bsxfun") check is pretty unnecessary.

Sure, that sounds good.

Just on a personal level, are you happy with the author credit in the
code?  It felt slightly cheeky saying '(c) me' and only thanking you --
it seemed OK in the original longer piece of code I wrote using bsxfun
on one line, but less so in a function where your suggestions are the
majority of the working lines :-)

Brief overview of what I'd like to do with the complete code this is
part of.  We're preparing code to be both Octave- and MATLAB-compatible
(which is why the bsxfun check is in there, only the most recent MATLAB
has bsxfun).  I'll probably put it onto Launchpad or GitHub as something
stand-alone to be maintained, with a permissive but GPL-compatible
license (BSD, I guess).  Then octave-forge stuff can be derived from it.
 Does this sound OK?

Again, sorry for not being more forthcoming with actual code examples
right now, but it will get there. :-)

Best wishes,

    -- Joe
_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Efficient code for operating on pairwise vector diffs

by Jaroslav Hajek-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun, Nov 8, 2009 at 10:41 PM, Joseph Wakeling
<joseph.wakeling@...> wrote:

> Jaroslav Hajek wrote:
>> Hi,
>> sorry for the late reply. For an OctaveForge file, I think you should
>> write an Octave-like inline docstring. Error checks (number&type of
>> arguments) would also be nice. The dim argument also sounds good.
>> If you target just Octave, the exist("bsxfun") check is pretty unnecessary.
>
> Sure, that sounds good.
>
> Just on a personal level, are you happy with the author credit in the
> code?  It felt slightly cheeky saying '(c) me' and only thanking you --
> it seemed OK in the original longer piece of code I wrote using bsxfun
> on one line, but less so in a function where your suggestions are the
> majority of the working lines :-)

I would never claim copyright for two lines of code. And by addressing
the suggestions above you'll get more code.

> Brief overview of what I'd like to do with the complete code this is
> part of.  We're preparing code to be both Octave- and MATLAB-compatible
> (which is why the bsxfun check is in there, only the most recent MATLAB
> has bsxfun).  I'll probably put it onto Launchpad or GitHub as something
> stand-alone to be maintained, with a permissive but GPL-compatible
> license (BSD, I guess).  Then octave-forge stuff can be derived from it.
>  Does this sound OK?
>

I think so. Of course there's no need to include your code in
OctaveForge (implying more maintenance if the primary source is
elsewhere), it's just that OctaveForge is the place where people look
for Octave extensions first.

>
> Again, sorry for not being more forthcoming with actual code examples
> right now, but it will get there. :-)
>
> Best wishes,
>
>    -- Joe
>

regards

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: Efficient code for operating on pairwise vector diffs

by Joseph Wakeling :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Jaroslav Hajek wrote:
> I think so. Of course there's no need to include your code in
> OctaveForge (implying more maintenance if the primary source is
> elsewhere), it's just that OctaveForge is the place where people look
> for Octave extensions first.

No, I definitely want it in octave-forge.  I've never got over the
pleasure of discovering that information-theoretic functions I
contributed had made their way into the official packages of the Linux
distribution I use. :-)  I just have to be conscientious to all the poor
scientists out there who have never used anything other than MATLAB ...
_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave