|
View:
New views
8 Messages
—
Rating Filter:
Alert me
|
|
|
FYI: nth_elementhi all,
Octave has just got a new function: nth_element: http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1 the docstring: -- Built-in Function: nth_element (X, N) -- Built-in Function: nth_element (X, N, DIM) Select the n-th smallest element of a vector, using the ordering defined by `sort'. In other words, the result is equivalent to `sort(X)(N)'. N can also be a contiguous range, either ascending `l:u' or descending `u:-1:l', in which case a range of elements is returned. If X is an array, `nth_element' operates along the dimension defined by DIM, or the first non-singleton dimension if DIM is not given. nth_element encapsulates the C++ STL algorithms nth_element and partial_sort. On average, the complexity of the operation is O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'. This function is intended for cases where the ratio K/M is small; otherwise, it may be better to use `sort'. See also: sort, min, max In short, it allows extracting a small portion of sort(X) without actually doing the full sort. This is sometimes useful for statistics when computing quantiles and the like... statisticians can surely tell better. Maybe it can be used to boost some existing functions. As an example, it can be readily used to speed up the "median" function: http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e the following benchmark demonstrates the speed-up: for n = [2000, 2001] a = rand (n); disp (n); tic; median (a(:)); toc tic; median (a, 1); toc tic; median (a, 2); toc endfor using a recent tip, Core 2 Duo @ 2.83 GHz, g++ -O3 -march=native, I get: 2000 Elapsed time is 0.62825 seconds. Elapsed time is 0.34634 seconds. Elapsed time is 0.40085 seconds. 2001 Elapsed time is 0.62713 seconds. Elapsed time is 0.34236 seconds. Elapsed time is 0.39419 seconds. with the new patch, I get 2000 Elapsed time is 0.061557 seconds. Elapsed time is 0.049 seconds. Elapsed time is 0.074331 seconds. 2001 Elapsed time is 0.047512 seconds. Elapsed time is 0.047687 seconds. Elapsed time is 0.075987 seconds. Contributing tests would be warmly welcome. There's a number of corner cases, esp. if NaNs come into play, so I expect bugs to be still there. enjoy! -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz |
|
|
Re: FYI: nth_elementOn Thu, Oct 15, 2009 at 9:15 AM, Jaroslav Hajek <highegg@...> wrote:
> hi all, > > Octave has just got a new function: nth_element: > http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1 > > the docstring: > > -- Built-in Function: nth_element (X, N) > -- Built-in Function: nth_element (X, N, DIM) > Select the n-th smallest element of a vector, using the ordering > defined by `sort'. In other words, the result is equivalent to > `sort(X)(N)'. N can also be a contiguous range, either ascending > `l:u' or descending `u:-1:l', in which case a range of elements is > returned. If X is an array, `nth_element' operates along the > dimension defined by DIM, or the first non-singleton dimension if > DIM is not given. > > nth_element encapsulates the C++ STL algorithms nth_element and > partial_sort. On average, the complexity of the operation is > O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'. This > function is intended for cases where the ratio K/M is small; > otherwise, it may be better to use `sort'. > > See also: sort, min, max > > In short, it allows extracting a small portion of sort(X) without > actually doing the full sort. > This is sometimes useful for statistics when computing quantiles and > the like... statisticians can surely tell better. > > Maybe it can be used to boost some existing functions. As an example, > it can be readily used to speed up the "median" function: > http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e > Shai |
|
|
Re: FYI: nth_elementOn Thu, Oct 15, 2009 at 9:56 AM, Shai Ayal <shaiay@...> wrote:
> On Thu, Oct 15, 2009 at 9:15 AM, Jaroslav Hajek <highegg@...> wrote: >> hi all, >> >> Octave has just got a new function: nth_element: >> http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1 >> >> the docstring: >> >> -- Built-in Function: nth_element (X, N) >> -- Built-in Function: nth_element (X, N, DIM) >> Select the n-th smallest element of a vector, using the ordering >> defined by `sort'. In other words, the result is equivalent to >> `sort(X)(N)'. N can also be a contiguous range, either ascending >> `l:u' or descending `u:-1:l', in which case a range of elements is >> returned. If X is an array, `nth_element' operates along the >> dimension defined by DIM, or the first non-singleton dimension if >> DIM is not given. >> >> nth_element encapsulates the C++ STL algorithms nth_element and >> partial_sort. On average, the complexity of the operation is >> O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'. This >> function is intended for cases where the ratio K/M is small; >> otherwise, it may be better to use `sort'. >> >> See also: sort, min, max >> >> In short, it allows extracting a small portion of sort(X) without >> actually doing the full sort. >> This is sometimes useful for statistics when computing quantiles and >> the like... statisticians can surely tell better. >> >> Maybe it can be used to boost some existing functions. As an example, >> it can be readily used to speed up the "median" function: >> http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e >> > It can also help hist quite a lot. > I don't see how. Can you clarify? -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz |
|
|
Re: FYI: nth_elementOn Thu, Oct 15, 2009 at 12:36 PM, Jaroslav Hajek <highegg@...> wrote:
> On Thu, Oct 15, 2009 at 9:56 AM, Shai Ayal <shaiay@...> wrote: >> On Thu, Oct 15, 2009 at 9:15 AM, Jaroslav Hajek <highegg@...> wrote: >>> hi all, >>> >>> Octave has just got a new function: nth_element: >>> http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1 >>> >>> the docstring: >>> >>> -- Built-in Function: nth_element (X, N) >>> -- Built-in Function: nth_element (X, N, DIM) >>> Select the n-th smallest element of a vector, using the ordering >>> defined by `sort'. In other words, the result is equivalent to >>> `sort(X)(N)'. N can also be a contiguous range, either ascending >>> `l:u' or descending `u:-1:l', in which case a range of elements is >>> returned. If X is an array, `nth_element' operates along the >>> dimension defined by DIM, or the first non-singleton dimension if >>> DIM is not given. >>> >>> nth_element encapsulates the C++ STL algorithms nth_element and >>> partial_sort. On average, the complexity of the operation is >>> O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'. This >>> function is intended for cases where the ratio K/M is small; >>> otherwise, it may be better to use `sort'. >>> >>> See also: sort, min, max >>> >>> In short, it allows extracting a small portion of sort(X) without >>> actually doing the full sort. >>> This is sometimes useful for statistics when computing quantiles and >>> the like... statisticians can surely tell better. >>> >>> Maybe it can be used to boost some existing functions. As an example, >>> it can be readily used to speed up the "median" function: >>> http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e >>> >> It can also help hist quite a lot. >> > > I don't see how. Can you clarify? You are right -- I was confused. hist needs the number of elements smaller than a given value (and uses sort to find out), while nth_element gives the value of the nth element. Writing hist in c++ is not hard and would really help since it uses (M+K)log(M+K) sort in order to avoid a M*K loop just because looping is slow in the scripting language. One of these days I'll get down to doing that ... Shai |
|
|
Re: FYI: nth_elementOn Thu, Oct 15, 2009 at 2:22 PM, Shai Ayal <shaiay@...> wrote:
> On Thu, Oct 15, 2009 at 12:36 PM, Jaroslav Hajek <highegg@...> wrote: >> On Thu, Oct 15, 2009 at 9:56 AM, Shai Ayal <shaiay@...> wrote: >>> On Thu, Oct 15, 2009 at 9:15 AM, Jaroslav Hajek <highegg@...> wrote: >>>> hi all, >>>> >>>> Octave has just got a new function: nth_element: >>>> http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1 >>>> >>>> the docstring: >>>> >>>> -- Built-in Function: nth_element (X, N) >>>> -- Built-in Function: nth_element (X, N, DIM) >>>> Select the n-th smallest element of a vector, using the ordering >>>> defined by `sort'. In other words, the result is equivalent to >>>> `sort(X)(N)'. N can also be a contiguous range, either ascending >>>> `l:u' or descending `u:-1:l', in which case a range of elements is >>>> returned. If X is an array, `nth_element' operates along the >>>> dimension defined by DIM, or the first non-singleton dimension if >>>> DIM is not given. >>>> >>>> nth_element encapsulates the C++ STL algorithms nth_element and >>>> partial_sort. On average, the complexity of the operation is >>>> O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'. This >>>> function is intended for cases where the ratio K/M is small; >>>> otherwise, it may be better to use `sort'. >>>> >>>> See also: sort, min, max >>>> >>>> In short, it allows extracting a small portion of sort(X) without >>>> actually doing the full sort. >>>> This is sometimes useful for statistics when computing quantiles and >>>> the like... statisticians can surely tell better. >>>> >>>> Maybe it can be used to boost some existing functions. As an example, >>>> it can be readily used to speed up the "median" function: >>>> http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e >>>> >>> It can also help hist quite a lot. >>> >> >> I don't see how. Can you clarify? > > You are right -- I was confused. hist needs the number of elements > smaller than a given value (and uses sort to find out), while > nth_element gives the value of the nth element. > Writing hist in c++ is not hard and would really help since it uses > (M+K)log(M+K) sort in order to avoid a M*K loop just because looping > is slow in the scripting language. One of these days I'll get down to > doing that ... > > Shai > Why not just use a binary search, via `lookup'? That would be O(M*log(K) + K) (M the number of data and K the number of bins). Look what histc does... maybe hist can even simply call histc. hth -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz |
|
|
Re: FYI: nth_elementOn Thu, Oct 15, 2009 at 2:31 PM, Jaroslav Hajek <highegg@...> wrote:
> On Thu, Oct 15, 2009 at 2:22 PM, Shai Ayal <shaiay@...> wrote: >> On Thu, Oct 15, 2009 at 12:36 PM, Jaroslav Hajek <highegg@...> wrote: >>> On Thu, Oct 15, 2009 at 9:56 AM, Shai Ayal <shaiay@...> wrote: >>>> On Thu, Oct 15, 2009 at 9:15 AM, Jaroslav Hajek <highegg@...> wrote: >>>>> hi all, >>>>> >>>>> Octave has just got a new function: nth_element: >>>>> http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1 >>>>> >>>>> the docstring: >>>>> >>>>> -- Built-in Function: nth_element (X, N) >>>>> -- Built-in Function: nth_element (X, N, DIM) >>>>> Select the n-th smallest element of a vector, using the ordering >>>>> defined by `sort'. In other words, the result is equivalent to >>>>> `sort(X)(N)'. N can also be a contiguous range, either ascending >>>>> `l:u' or descending `u:-1:l', in which case a range of elements is >>>>> returned. If X is an array, `nth_element' operates along the >>>>> dimension defined by DIM, or the first non-singleton dimension if >>>>> DIM is not given. >>>>> >>>>> nth_element encapsulates the C++ STL algorithms nth_element and >>>>> partial_sort. On average, the complexity of the operation is >>>>> O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'. This >>>>> function is intended for cases where the ratio K/M is small; >>>>> otherwise, it may be better to use `sort'. >>>>> >>>>> See also: sort, min, max >>>>> >>>>> In short, it allows extracting a small portion of sort(X) without >>>>> actually doing the full sort. >>>>> This is sometimes useful for statistics when computing quantiles and >>>>> the like... statisticians can surely tell better. >>>>> >>>>> Maybe it can be used to boost some existing functions. As an example, >>>>> it can be readily used to speed up the "median" function: >>>>> http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e >>>>> >>>> It can also help hist quite a lot. >>>> >>> >>> I don't see how. Can you clarify? >> >> You are right -- I was confused. hist needs the number of elements >> smaller than a given value (and uses sort to find out), while >> nth_element gives the value of the nth element. >> Writing hist in c++ is not hard and would really help since it uses >> (M+K)log(M+K) sort in order to avoid a M*K loop just because looping >> is slow in the scripting language. One of these days I'll get down to >> doing that ... >> >> Shai >> > > Why not just use a binary search, via `lookup'? That would be > O(M*log(K) + K) (M the number of data and K the number of bins). > Look what histc does... maybe hist can even simply call histc. them is that hist plots the results Shai |
|
|
Re: FYI: nth_element2009/10/15 Jaroslav Hajek <highegg@...>:
> nth_element encapsulates the C++ STL algorithms nth_element and > partial_sort. Silly and pointless nitpickery: would you please refer to it as the C++ standard library instead of STL? Although nth_element and partial_sort are indeed in the STL that SGI contribute to the C++ stdlib and the GNU implementation is based on SGI's contributions, the term "STL", 11 years after being standardised and very well implemented, still carries connotations of "strange, bloated, foreign, and nonstandard part of C++". For marketing purposes, I think it's better to say that these are Octave functions that are wrapping standard C++ functions. That's all. ;-) |
|
|
Re: FYI: nth_element2009/10/17 Jordi Gutiérrez Hermoso <jordigh@...>:
> 2009/10/15 Jaroslav Hajek <highegg@...>: >> nth_element encapsulates the C++ STL algorithms nth_element and >> partial_sort. > > Silly and pointless nitpickery: would you please refer to it as the > C++ standard library instead of STL? Although nth_element and > partial_sort are indeed in the STL that SGI contribute to the C++ > stdlib and the GNU implementation is based on SGI's contributions, the > term "STL", 11 years after being standardised and very well > implemented, still carries connotations of "strange, bloated, foreign, > and nonstandard part of C++". > Sorry, I was not aware of the distinction. For me, STL was simply the containers+algorithms part of C++ standard library. > For marketing purposes, I think it's better to say that these are > Octave functions that are wrapping standard C++ functions. That's all. > ;-) > Marketing? I hope not :) -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz |
| Free embeddable forum powered by Nabble | Forum Help |