new file, oct-mem.h

In the new file liboctave/oct-mem.h:

  // Fill by value, with a check for zero. This boils down to memset if value is
  // a POD zero.
  template <class T>
  inline void octave_fill (octave_idx_type n, const T& value, T *dest)
  { std::fill_n (dest, n, value); }

  template <class T>
  inline bool octave_fill_iszero (const T& value)
  { return value == T(); }

  template <class T>
  inline bool octave_fill_iszero (const std::complex<T>& value)
  { return value.real () == T() && value.imag () == T(); }

  template <class T>
  inline bool octave_fill_iszero (const octave_int<T>& value)
  { return value.value () == T(); }

  #define DEFINE_POD_FILL(T) \
  inline void octave_fill (octave_idx_type n, const T& value, T *dest) \
  { \
    if (octave_fill_iszero (value)) \
      std::memset (dest, 0, n * sizeof (T)); \
    else \
      std::fill_n (dest, n, value); \
  }

These rely on zero-valued floating point numbers having all bits zero,
which is not guaranteed by C/C++.  But it is guaranteed by the IEEE
754 format.  I don't think it is a bad thing to require IEEE 754 (many
things in Octave won't work properly without IEEE floating point
math), but maybe we should state that assumption clearly with a
configure test?  Oh, OK, this requirement is more or less enforced now
in octave_ieee_init.  So maybe this is OK as it is, though I guess I
would prefer to have a comment stating the assumption here, and perhaps
also an easy way to disable this optimization if someone wanted to
experiment with Octave on a system with a different floating point
format.

  // Uninitialized allocation. Will not initialize memory for complex and octave_int.
  // Memory allocated by octave_new should be freed by octave_delete.
  template <class T>
  inline T *octave_new (octave_idx_type n)
  { return new T[n]; }
  template <class T>
  inline void octave_delete (T *ptr)
  { delete [] ptr; }

  #define DEFINE_POD_NEW_DELETE(T) \
  template <> \
  inline T *octave_new<T > (octave_idx_type n) \
  { return reinterpret_cast<T *> (new char[n * sizeof (T)]); } \
  template <> \
  inline void octave_delete<T > (T *ptr) \
  { delete [] reinterpret_cast<char *> (ptr); }

Maybe a better name for this function would be "uninitialized_new" or
"no_ctor_new" or something similar that states more clearly what the
intent is?  Otherwise, I think it will be easy to confuse them as just
being wrappers around new/delete.

jwe
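The two requests in the message above (a comment stating the IEEE 754 assumption, and an easy way to switch the optimization off) could look roughly like the following. This is only a sketch under stated assumptions, not the code in oct-mem.h: the names sketch_fill and SKETCH_NO_MEMSET_FILL are hypothetical, and the std::signbit test is an addition, included because -0.0 compares equal to zero but is not stored as all-zero bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstring>

// Hypothetical opt-out switch (not an Octave macro): compile with
// -DSKETCH_NO_MEMSET_FILL to force the portable std::fill_n path, e.g. when
// experimenting with a floating point format other than IEEE 754.
#ifdef SKETCH_NO_MEMSET_FILL
#  define SKETCH_USE_MEMSET_FILL 0
#else
#  define SKETCH_USE_MEMSET_FILL 1
#endif

// Fill n doubles with value.
// ASSUMPTION: +0.0 is stored as all-zero bits. IEEE 754 guarantees this;
// the C and C++ standards on their own do not.
inline void sketch_fill (std::size_t n, const double& value, double *dest)
{
  if (n == 0)
    return;
  assert (dest != nullptr);

  // -0.0 == 0.0 is true, but -0.0 has its sign bit set, so it must not take
  // the memset path.
  if (SKETCH_USE_MEMSET_FILL && value == 0.0 && ! std::signbit (value))
    std::memset (dest, 0, n * sizeof (double));
  else
    std::fill_n (dest, n, value);
}
```

Calling sketch_fill (4, 0.0, buf) on a four-element buffer takes the memset path, while any non-zero value (or -0.0) falls through to std::fill_n, so both paths leave the same values in memory.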
Re: new file, oct-mem.h

On Tue, Nov 3, 2009 at 9:33 PM, John W. Eaton <jwe@...> wrote:
> In the new file liboctave/oct-mem.h:
>
>  // Fill by value, with a check for zero. This boils down to memset if value is
>  // a POD zero.
>  template <class T>
>  inline void octave_fill (octave_idx_type n, const T& value, T *dest)
>  { std::fill_n (dest, n, value); }
>
>  template <class T>
>  inline bool octave_fill_iszero (const T& value)
>  { return value == T(); }
>
>  template <class T>
>  inline bool octave_fill_iszero (const std::complex<T>& value)
>  { return value.real () == T() && value.imag () == T(); }
>
>  template <class T>
>  inline bool octave_fill_iszero (const octave_int<T>& value)
>  { return value.value () == T(); }
>
>  #define DEFINE_POD_FILL(T) \
>  inline void octave_fill (octave_idx_type n, const T& value, T *dest) \
>  { \
>    if (octave_fill_iszero (value)) \
>      std::memset (dest, 0, n * sizeof (T)); \
>    else \
>      std::fill_n (dest, n, value); \
>  }
>
> These rely on zero-valued floating point numbers having all bits zero,
> which is not guaranteed by C/C++.  But it is guaranteed by the IEEE
> 754 format.  I don't think it is a bad thing to require IEEE 754 (many
> things in Octave won't work properly without IEEE floating point
> math), but maybe we should state that assumption clearly with a
> configure test?  Oh, OK, this requirement is more or less enforced now
> in octave_ieee_init.  So maybe this is OK as it is, though I guess I
> would prefer to have a comment stating the assumption here, and perhaps
> also an easy way to disable this optimization if someone wanted to
> experiment with Octave on a system with a different floating point
> format.
>
>  // Uninitialized allocation. Will not initialize memory for complex and octave_int.
>  // Memory allocated by octave_new should be freed by octave_delete.
>  template <class T>
>  inline T *octave_new (octave_idx_type n)
>  { return new T[n]; }
>  template <class T>
>  inline void octave_delete (T *ptr)
>  { delete [] ptr; }
>
>  #define DEFINE_POD_NEW_DELETE(T) \
>  template <> \
>  inline T *octave_new<T > (octave_idx_type n) \
>  { return reinterpret_cast<T *> (new char[n * sizeof (T)]); } \
>  template <> \
>  inline void octave_delete<T > (T *ptr) \
>  { delete [] reinterpret_cast<char *> (ptr); }
>
> Maybe a better name for this function would be "uninitialized_new" or
> "no_ctor_new" or something similar that states more clearly what the
> intent is?  Otherwise, I think it will be easy to confuse them as just
> being wrappers around new/delete.
>
> jwe
>

OK, I renamed the functions to more descriptive names: no_ctor_new,
no_ctor_delete, copy_or_memcpy, fill_or_memset.

I also changed the test for zero to use reinterpret_cast to a suitable
unsigned integer type, so that it does not actually rely on IEEE, even
though that may be unnecessary. Here, it makes good sense because what
one actually wants to test for is whether the fill-in value has an
all-zero memory pattern (which is typically true when arrays are
resized).

These changes speed up some indexing, indexed assignment and permuting
for integer & complex types: memcpy is used instead of plain loops and
memory is not uselessly zeroed after allocation. For single & double
real matrices, sometimes no speed-up is visible, sometimes around 30%.
The C++ standard library supplied with GCC optimizes std::copy to
memmove for POD types. So it appears that using memmove for a
non-overlapping memory copy is sometimes as fast as memcpy, but
sometimes slower. That seems really interesting.

In any case, indexing is somewhat more efficient again...

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
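The "zero memory pattern" test described in the reply above can be illustrated as follows. This is a minimal sketch and not the committed Octave code: it swaps the reinterpret_cast to an unsigned integer for a byte-wise std::memcpy inspection (which sidesteps aliasing concerns), it is meant for plain arithmetic types without padding bytes, and the sketch_ names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// True if every byte of value's object representation is zero. Unlike
// "value == T()", this asks exactly the question that matters before a
// memset: would writing zero bytes reproduce this value?
template <class T>
inline bool sketch_has_zero_pattern (const T& value)
{
  static_assert (std::is_trivially_copyable<T>::value,
                 "byte-wise inspection needs a trivially copyable type");
  unsigned char bytes[sizeof (T)];
  std::memcpy (bytes, &value, sizeof (T));
  return std::all_of (bytes, bytes + sizeof (T),
                      [] (unsigned char b) { return b == 0; });
}

// Fill n elements with value, using memset only when value is all-zero bits.
template <class T>
inline void sketch_fill_or_memset (std::size_t n, const T& value, T *dest)
{
  if (n == 0)
    return;
  assert (dest != nullptr);

  if (sketch_has_zero_pattern (value))
    std::memset (dest, 0, n * sizeof (T));
  else
    std::fill_n (dest, n, value);
}
```

Because the decision is made on the stored bytes rather than on a comparison with zero, the memset path is correct whatever the floating point format is; on an IEEE 754 machine, -0.0 simply takes the std::fill_n path.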
Re: new file, oct-mem.h

On 4-Nov-2009, Rik wrote:

| Isn't the configure check preferable in that it catches the assumption
| at compile time rather than run time?  It would be annoying to spend 45
| minutes compiling the sources only to have octave complain and demand to
| be re-compiled with the correct IEEE flags.  But, perhaps I'm not
| understanding the situation.

I checked in the following change.

  http://hg.savannah.gnu.org/hgweb/octave/rev/bb70d16cca3b

jwe
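For readers weighing the compile-time versus run-time question raised above, here is one way the two kinds of check could be written in C++11. It is an illustrative sketch only and makes no claim about what the linked changeset contains; static_assert postdates this thread, the function name is hypothetical, and a configure test would be the build-system counterpart of the first half.

```cpp
#include <cstring>
#include <limits>

// Compile-time check: the build stops here if the implementation does not
// claim IEC 559 (IEEE 754) conformance for float and double.
static_assert (std::numeric_limits<double>::is_iec559,
               "this sketch assumes IEEE 754 (IEC 559) doubles");
static_assert (std::numeric_limits<float>::is_iec559,
               "this sketch assumes IEEE 754 (IEC 559) floats");

// Run-time counterpart: confirm that +0.0 really is stored as all-zero
// bytes, which is the property the memset shortcut depends on.
inline bool sketch_zero_is_all_bits_zero ()
{
  const double zero = 0.0;
  const unsigned char zeros[sizeof (double)] = {};
  return std::memcmp (&zero, zeros, sizeof (double)) == 0;
}
```

The static_assert lines fail the build immediately, before any long compilation has been spent, whereas the run-time function can only report the problem once the program is already running.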