[Octave] Improving Octave for large files

Hi,

I have been looking for some time for a cool GNU project to help with, and the Octave project is just what I was looking for. I have been following the Octave project for some time (reading the code and lurking on the mailing list) and have now decided to do some real work if possible.

I have been looking at the PROJECTS file in the source and wanted to hear whether anyone is working on the problem with large files that Juhana K. Kouhia talks about (I couldn't find any code in the src/load-save.cc file to indicate that). I have a friend working on the TPIE library (http://www.madalgo.au.dk/Trac-tpie/) and thought it would fit nicely into the Octave source. Does anyone have any concerns about including the TPIE library, or any comments about how best to add the functionality?

Cheers,
Christian Brædstrup
Re: [Octave] Improving Octave for large files

Christian Brædstrup wrote:
> I have been looking in the PROJECTS file in the source and wanted to hear if anyone
> is working on the problem with large files that Juhana K. Kouhia talks about
> (I couldn't find any code in the src/load-save.cc file to indicate that)? I
> have a friend working on the TPIE library (
> http://www.madalgo.au.dk/Trac-tpie/) and thought it would fit nicely into
> the octave source. Does anyone have any concerns about including the TPIE
> library or any comments about how best to add the functionality.

That idea was proposed in 1994 http://old.nabble.com/Octave-question-to9226868.html#a9226868 and things have perhaps moved on a bit since. I'd say the large-file issues are now twofold:

1) Data sets with more than 2^31 elements, which need 64-bit indexing. The ability to handle such data sets is in Octave but poorly tested. Loading and saving files for such data sets has not been tested, however, though the HDF5 formats should be able to handle this.

2) Large data sets tend to go hand in hand with large computational problems, and the parallelisation and distribution of a database across many nodes could be improved.

I'm sorry, I don't really know what TPIE has to offer, but I suspect it defers reading data from a file until it's needed. In that case, integrating TPIE probably means implementing user types from the ground up (right down to a reimplementation of the Array class). Is the benefit worth the cost?

D.

--
David Bateman dbateman@...
35 rue Gambetta +33 1 46 04 02 18 (Home)
92100 Boulogne-Billancourt FRANCE +33 6 72 01 06 33 (Mob)
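
For a sense of scale on the 2^31 element limit mentioned in this post: a signed 32-bit index cannot address more than 2^31 - 1 elements, and a dense array of doubles at that limit already occupies about 16 GiB. The following is a minimal sketch of that arithmetic only; the function names are illustrative and are not part of Octave.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def needs_64bit_index(rows: int, cols: int) -> bool:
    """Return True if a dense rows x cols array has more elements
    than a signed 32-bit index can address."""
    return rows * cols > INT32_MAX


def dense_bytes(rows: int, cols: int, elem_size: int = 8) -> int:
    """Bytes needed for a dense array (8-byte doubles by default)."""
    return rows * cols * elem_size
```

For example, a square matrix of doubles crosses the limit between 46340 x 46340 and 46341 x 46341, so test files for this case would need to be at least that large.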
Re: [Octave] Improving Octave for large files

Okay, I didn't know the feature request was that old.

Since the code is so untested, perhaps the best thing to do is to create some large files and see how Octave handles them to begin with.

As far as I have been told, TPIE sorts all the data from the input file and then only accesses the data it needs after the sort (to save memory). The reason I suggested the library is that I know it is actively developed at a university and that the group uses it to handle very large sets of satellite data for 3D terrain mapping. But if using the library involves rewriting a lot of good code, it would be foolish to use it.

On Wed, Nov 11, 2009 at 12:51 AM, David Bateman <dbateman@...> wrote:
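
The sort-then-access pattern Christian describes is, in general terms, what an external (out-of-core) sort does: sort chunks that fit in memory, spill each sorted run to disk, then merge the runs while streaming. The sketch below is a generic external merge sort using only the Python standard library. It is an assumption about the general technique and does not use or reproduce TPIE's actual API; the names `external_sort`, `_write_run` and `_read_run` are hypothetical.

```python
import heapq
import os
import struct
import tempfile

RECORD = struct.Struct("<d")  # one little-endian double per record


def _write_run(values):
    """Write an already sorted run of floats to a temp file; return its path."""
    fd, path = tempfile.mkstemp(suffix=".run")
    with os.fdopen(fd, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))
    return path


def _read_run(path):
    """Stream floats back from a run file, one record at a time."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            yield RECORD.unpack(chunk)[0]


def external_sort(values, chunk_size=1000):
    """Yield the input in ascending order, buffering at most chunk_size
    values in memory while the sorted runs are being created."""
    runs, chunk = [], []
    for v in values:
        chunk.append(v)
        if len(chunk) >= chunk_size:
            runs.append(_write_run(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(_write_run(sorted(chunk)))
    try:
        # heapq.merge holds only one pending record per run in memory.
        yield from heapq.merge(*(_read_run(p) for p in runs))
    finally:
        for p in runs:
            os.remove(p)
```

Consuming the result with a `for` loop keeps memory use bounded by `chunk_size` plus one record per run, which is the property that matters for files larger than RAM.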
Re: [Octave] Improving Octave for large files

Christian Brædstrup wrote:
> Okay, I didn't know the feature request was that old.
> When the code is so untested then perhaps the best thing to do is to create
> some large files and see how Octave handles them to begin with.

The more people use 64-bit versions of Octave, the faster the bugs will be worked out. It takes a bit of effort to get a 64-bit build of Octave right, though, as the integer type for BLAS, LAPACK and other libraries needs to be 64-bit. It can be done, however.

> As far as I have been told TPIE sorts all the data from the input file and
> then only access the data it needs after the sort (to save memory space).
> The reason I suggested the library is because I know it is actively
> developed at a university level and that the group uses it to handle very
> very large sets of satellite data to do 3D terrain mapping. But if using the
> library involves rewriting a lot of good code it would be foolish to use it.

As I said, I don't know TPIE, but it sounds like one route to integrating it might be to find a way to allow Array<T>::data to be replaced with an external representation given by TPIE, though doing that might be quite dangerous.

D.

--
David Bateman dbateman@...
35 rue Gambetta +33 1 46 04 02 18 (Home)
92100 Boulogne-Billancourt FRANCE +33 6 72 01 06 33 (Mob)
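
To illustrate what an array with an "external representation" could look like in the abstract, here is a minimal sketch of a read-only sequence whose storage is a memory-mapped file rather than an in-memory buffer, so data is only paged in when an element is accessed. This is a conceptual illustration in Python under stated assumptions; it is not Octave's Array class and not TPIE's interface, and `FileBackedDoubles` and `write_doubles` are hypothetical names.

```python
import mmap
import struct


class FileBackedDoubles:
    """Read-only sequence of doubles stored in a file, paged in on demand."""

    ITEM = struct.Struct("<d")  # one little-endian double per element

    def __init__(self, path):
        # Note: mmap cannot map an empty file, so the file must hold data.
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self):
        return len(self._map) // self.ITEM.size

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Only the page containing this element has to be resident.
        return self.ITEM.unpack_from(self._map, i * self.ITEM.size)[0]

    def close(self):
        self._map.close()
        self._file.close()


def write_doubles(path, values):
    """Helper for the sketch: dump floats to a flat binary file."""
    with open(path, "wb") as f:
        for v in values:
            f.write(FileBackedDoubles.ITEM.pack(v))
```

The danger noted in the post plausibly shows up here too: any code that assumes the storage is an ordinary contiguous in-memory buffer, for instance when handing a raw pointer to a numerical library, would need to be audited before the backing store could be swapped.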