At 15:00 12/08/2009 -0400, J Robinson wrote:
>For example, if I have a text like:
> this is line one and this
> is line two.
>
>I get extracted text like
> this is line one and thisis line two
>(with no space between "this" and "is")
>
>
>Is there a way to get ps2ascii to leave some whitespace in the output
>text where the line broke? Or is this a bug in ps2ascii? Thanks for any
>assistance or input on this issue, it's kind of messing up my app! Let me
>know if you need an example doc or more data.
You need to understand that the way PostScript represents your data
does not need to bear any relation to the way it is laid out in your document.
The two lines you quote may follow each other in the document, or they may
be interspersed with other chunks of text, images and linework. In addition,
the text you see may not be represented as text, and may well not be
represented as two strings. Justification often causes what would appear to
be strings of text to be broken into multiple strings, or individual glyphs.
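
To make that concrete, here is a rough Python sketch. It is not what ps2ascii.ps actually does; the (x, y, text) tuple layout, the coordinates and the 2-point tolerance are all assumptions made up for illustration. It only shows why strings painted at different positions carry no whitespace between them, and how a simple position-based guess could restore the line break.

```python
# Hypothetical model of painted text: each fragment is (x, y, text), listed in
# the order the PostScript program happens to paint it. Neither a newline nor
# a trailing space is stored anywhere; only the position changes.
fragments = [
    (72.0, 700.0, "this is line one and this"),
    (72.0, 688.0, "is line two."),
]


def naive_join(frags):
    """Concatenate the strings in paint order, ignoring their positions."""
    return "".join(text for _x, _y, text in frags)


def join_with_line_guess(frags, tolerance=2.0):
    """Start a new output line whenever the baseline (y) moves by more than
    `tolerance` points. The threshold is an arbitrary illustrative value."""
    lines = []
    last_y = None
    for _x, y, text in frags:
        if last_y is None or abs(y - last_y) > tolerance:
            lines.append(text)   # baseline moved: treat as a new line
        else:
            lines[-1] += text    # same baseline: continue the current line
        last_y = y
    return "\n".join(lines)


print(naive_join(fragments))
print(join_with_line_guess(fragments))
```

The first call reproduces the "thisis" symptom; the second puts the break back. A real extractor also has to cope with fragments painted out of reading order, per-glyph positioning and word gaps within a line, which is where this kind of guessing tends to go wrong.
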
Now, clearly you do have text, because ps2ascii is extracting it. The
PostScript program which performs the work (ps2ascii.ps) has the following
comment:
----------------------------------------------------------------
% $Id: ps2ascii.ps 6300 2005-12-28 19:56:24Z giles $
% Extract the ASCII text from a PostScript file. Nothing is displayed.
% Instead, ASCII information is written to stdout. The idea is similar to
% Glenn Reid's `distillery', only a lot more simple-minded, and less robust.
% If SIMPLE is defined, just the text is written, with a guess at line
% breaks and word spacing. If SIMPLE is not defined, lines are written
% to stdout as follows:
----------------------------------------------------------------
So it should attempt to detect text on different lines. Without seeing your
PostScript file I can't even guess at why it is defeating the heuristics in
ps2ascii.ps. If you want someone to look into this you will need to open a
bug report at:
http://bugs.ghostscript.com/enter_bug.cgi

Ken
_______________________________________________
gs-devel mailing list
gs-devel@...
http://www.ghostscript.com/mailman/listinfo/gs-devel