what's a fast way to read ascii data?

by Dr. Johannes Zellner

Hi,

I'd like to read ascii data like this:

1 2 3 4
5 6 7 8
....
some text
9 10 11 12
13 14 15 16
...
some more text


So essentially, I have some lines and columns of numeric data, but sometimes there's also a line of text.

I'm interested mainly in reading the numeric data.

If I do it like this:

       istream = fopen(fname, 'r');
       while -1 != (vstr = fgets(istream))
           items = str2double(vstr);
           if ~isnan(items) && length(items) == 4
               ....

it is really slow.

Any hints to do this faster?

Johannes


_______________________________________________
Help-octave mailing list
Help-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave

Re: what's a fast way to read ascii data?

by Judd Storrs

Well, I don't know about your particular files, but I've often been able to get a file reader working acceptably by reading the entire file into a character array, tokenizing the whole string at once with the string processing functions, and then processing the tokens in a while loop. Here's an excerpt from some code I've written:

pid = fopen( path, "r" );
tok = regexp( fread(pid, "*char")', "\"([^\"]+)\"|([^\\s]+)", "match" );
fclose(pid);

For the files this code reads, the EOLs didn't matter, and the regular expression needed to pull in "strings inside quotations" as single tokens.
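
The excerpt stops before the token-processing loop. A minimal sketch of what such a loop could look like, assuming quoted tokens are kept as strings and everything else is tried as a number first. The sample text and the names txt, nums and strs are made up for illustration and are not from my original code:

```octave
% Hypothetical input standing in for the contents of a file.
txt = sprintf('name "hello world" 42\n3.5 "x"\n');

% Same pattern as in the excerpt, written as a single-quoted string.
tok = regexp(txt, '"([^"]+)"|([^\s]+)', 'match');

nums = [];   % numeric tokens, in order of appearance
strs = {};   % quoted strings and bare words
k = 1;
while k <= numel(tok)
  t = tok{k};
  if t(1) == '"'
    strs{end+1} = t(2:end-1);   % quoted token: drop the surrounding quotes
  else
    v = str2double(t);
    if isnan(v)
      strs{end+1} = t;          % not a number: keep as a bare word
    else
      nums(end+1) = v;          % numeric token
    end
  end
  k = k + 1;
end
```

A real reader would usually consume several tokens per iteration (a keyword followed by its arguments, say), which is why a while loop with an explicit index is handier here than a for loop.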

You could probably concoct a regular expression to extract lines with four numbers from the input. Here's a regexp that extracts groups of four whitespace-separated integers; it works on the example file you posted:

fid = fopen("data.txt","r") ;
tok = regexp( fread(fid, "*char")', '\s*(\d+\s*){4}\n', "match" );
fclose( fid );
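
The matches come back as a cell array of strings, so they still need converting to numbers. One way that should work is to join the matches and hand the whole thing to sscanf in a single call. This is a sketch only: the sample text below is an assumed stand-in for the contents of data.txt, and the names txt and data are just for illustration:

```octave
% Assumed file contents, modelled on the example in the question.
txt = sprintf(['1 2 3 4\n5 6 7 8\n....\nsome text\n', ...
               '9 10 11 12\n13 14 15 16\n...\nsome more text\n']);

% Same pattern as above: four integers followed by a newline.
tok = regexp(txt, '\s*(\d+\s*){4}\n', 'match');

% Concatenate all matches and convert them in one sscanf call.
% sscanf fills the 4-by-N result column by column, so transpose
% to get one row per matched line.
data = sscanf([tok{:}], '%f', [4, Inf])';
```

This avoids calling a conversion function once per line, which is where most of the time goes in the line-by-line version.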

See http://www.gnu.org/software/octave/doc/interpreter/Manipulating-Strings.html for other ideas about manipulating strings.


--judd




On Thu, Jul 2, 2009 at 4:56 PM, Dr. Johannes Zellner <johannes@...> wrote: