Parsing progressively...

Parsing progressively...

by Etienne Philip Pretorius

Hello list,

Just a quick question about Spirit. If I were to use it to create and
parse XML content, would it parse the content in one continuous pass, or
could one build a SAX-like parser with this framework? Something akin to
Xerces' PParse or libxml's parse_chunk.

More detail:
I am storing data from the network in a boost::array, and I have written
a forward iterator that, when dereferenced, decodes each UTF-8 byte sequence
into its Unicode code point. Should I build up the received content in
another buffer so that Spirit can parse from "<" to ">" in one call, or could
I pass the original boost::array that contains the "<" and then make
another parse call later to get the ">" in the next iteration of the network
loop? Would Spirit then recognize the start tag across the buffer break?

(Does Spirit keep its own buffer to track the parse with its state machine?)

Please excuse my ignorance of the Spirit framework; I am only
starting to look at it, and it seems like quite a lot for me to figure out
at this time.

Thank you,
Etienne

------------------------------------------------------------------------------
_______________________________________________
Spirit-general mailing list
Spirit-general@...
https://lists.sourceforge.net/lists/listinfo/spirit-general

Re: Parsing progressively...

by CARL BARRON-3

On Jul 1, 2009, at 4:41 PM, Etienne Philip Pretorius wrote:

>
> More detail:
> I am storing data from the network in a boost::array and I have  
> written
> a forward iterator to decode via dereferencing each utf8 byte sequence
> into its unicode code point. Should I build up the passed content into
> another buffer so that spirit could parse from "<"....">" or could I
> pass the original boost array that will contain the "<" and then later
> another parse call to get the ">" in the next iteration of the network
> loop. Will spirit then recognize the start tag over the buffer break?
Spirit has no input buffer of its own, and it is not designed to do
'stop and go' parsing. That is, it cannot parse an arbitrary amount of data
(an amount not known to the parser in advance), save its state, quit, and
later restart where it left off.

I suppose on_error<retry>(...) could be used to write an infinite
wait loop that continues parsing once more data is available. In the
meantime, though, it keeps parsing and failing, possibly eating CPU time
and preventing more input from arriving, depending on the OS. This also
assumes you can provide a real end-of-input indicator, so that the
on_error handler can tell whether it has reached the real end of input or
just a stop point. I HAVE NOT TRIED THIS.
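
Since Spirit keeps no buffer of its own, one workaround is for the caller to do the buffering and only hand complete units to the parser. The following is a minimal sketch of that idea using the standard library only. TagAssembler, feed and pending are hypothetical names, not part of Spirit, and the sketch makes simplifying assumptions: text outside tags is dropped, and a '>' inside attribute values, comments or CDATA is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Spirit): accumulates network chunks and
// hands out only complete "<...>" tags, keeping any partial tag for later.
class TagAssembler {
public:
    // Appends a chunk and returns every tag that is now complete.
    std::vector<std::string> feed(const std::string& chunk) {
        pending_ += chunk;
        std::vector<std::string> complete;
        std::size_t pos = 0;
        for (;;) {
            const std::size_t open = pending_.find('<', pos);
            if (open == std::string::npos) {
                pending_.clear();         // assumption: text outside tags is dropped
                return complete;
            }
            const std::size_t close = pending_.find('>', open);
            if (close == std::string::npos) {
                pending_.erase(0, open);  // keep the partial tag for the next chunk
                return complete;
            }
            assert(close > open);
            complete.push_back(pending_.substr(open, close - open + 1));
            pos = close + 1;
        }
    }

    // Bytes of a tag that has started but not yet been closed.
    const std::string& pending() const { return pending_; }

private:
    std::string pending_;
};
```

Each string returned by feed() is a complete tag, so it could be passed to an ordinary one-shot parse call without the parser ever seeing a buffer break.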


Re: Parsing progressively...

by Hartmut Kaiser

I think you're mixing two things here (see below).

> Just a quick question about spirit. If I was to use it to create and
> parse xml content, will it parse of the content continuously or could
> one build up a SAX like parser with this framework. Something akin to
> xecedes pparse or libxml parse_chunk.

This is certainly possible. Use semantic actions. These get called whenever
something matching your grammar has been seen in the input. See the
quickbook tool for a nice example of how this can be used in practice.

> More detail:
> I am storing data from the network in a boost::array and I have written
> a forward iterator to decode via dereferencing each utf8 byte sequence
> into its unicode code point. Should I build up the passed content into
> another buffer so that spirit could parse from "<"....">" or could I
> pass the original boost array that will contain the "<" and then later
> another parse call to get the ">" in the next iteration of the network
> loop. Will spirit then recognize the start tag over the buffer break?
>
> (does spirit keep its own buffer to track the parse with its state
> machine?)

This is not possible, as Spirit's recursive descent parser framework stores
the current parser state on the (hardware) stack. In order to do the
context switch between two parse steps you would need to use some coroutine
library. But this is non-trivial and requires some changes to Spirit itself.
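
For contrast, here is a minimal sketch of a parser that can be interrupted at a buffer break: a hand-written state machine whose state lives in data members instead of on the call stack. This is not Spirit code, and TagNameScanner and its members are hypothetical names chosen for illustration. It only collects the names of start tags and ignores attributes and closing tags.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical push-style recognizer: all parse state is kept in data members
// rather than on the call stack, so feed() can return at any byte boundary
// and pick up again when the next chunk arrives.
class TagNameScanner {
public:
    // Consumes a single byte.
    void feed(char c) {
        switch (state_) {
        case State::Text:
            if (c == '<') { name_.clear(); state_ = State::Name; }
            break;
        case State::Name:
            if (c == '>' || c == ' ' || c == '/') {
                // An empty name means a closing tag such as "</root>"; skip it.
                if (!name_.empty()) names_.push_back(name_);
                state_ = (c == '>') ? State::Text : State::SkipToClose;
            } else {
                name_ += c;
            }
            break;
        case State::SkipToClose:
            if (c == '>') state_ = State::Text;
            break;
        }
    }

    // Consumes a whole chunk, byte by byte.
    void feed(const std::string& chunk) {
        for (char c : chunk) feed(c);
    }

    // Names of all start tags completed so far.
    const std::vector<std::string>& names() const { return names_; }

private:
    enum class State { Text, Name, SkipToClose };
    State state_ = State::Text;
    std::string name_;
    std::vector<std::string> names_;
};
```

Because nothing is kept on the stack between calls, a tag split across two network reads is recognized once its second half is fed in. The price is that the grammar has to be written out as explicit states by hand.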

Regards Hartmut




Re: Parsing progressively...

by Etienne Philip Pretorius

>> More detail:
>> I am storing data from the network in a boost::array and I have written
>> a forward iterator to decode via dereferencing each utf8 byte sequence
>> into its unicode code point. Should I build up the passed content into
>> another buffer so that spirit could parse from "<"....">" or could I
>> pass the original boost array that will contain the "<" and then later
>> another parse call to get the ">" in the next iteration of the network
>> loop. Will spirit then recognize the start tag over the buffer break?
>>
>> (does spirit keep its own buffer to track the parse with its state
>> machine?)
>
> This is not possible as Spirits recursive descent parser framework stores
> the current parser state on the (hardware-) stack. In order to do the
> context switch in between two parse steps you need to use some co-routine
> library. But this is non-trivial and requires some changes to Spirit itself.
>
Thank you.

Could a feature request be made so that the Spirit Framework could be
used to parse content sent over a network connection?

Would the rest of the community find such a feature useful?

Kind Regards,
Etienne

Re: Parsing progressively...

by Felipe Magno de Almeida

On Fri, Jul 3, 2009 at 11:48 PM, Hartmut Kaiser<hartmut.kaiser@...> wrote:
> I think you're mixing two things here (see below).

[snip]

> This is not possible as Spirits recursive descent parser framework stores
> the current parser state on the (hardware-) stack. In order to do the
> context switch in between two parse steps you need to use some co-routine
> library. But this is non-trivial and requires some changes to Spirit itself.

Would it really need changes to Spirit?
A non-match for something could trigger a semantic action that
calls yield and retries later. This loop parser could be controlled by a
boolean condition.
I think it is doable, though the logic would all have to be in the
parser, which could become too much of a burden.

> Regards Hartmut

--
Felipe Magno de Almeida

Re: Parsing progressively...

by Commander Pirx

>> This is not possible as Spirits recursive descent parser framework stores
>> the current parser state on the (hardware-) stack. In order to do the
>> context switch in between two parse steps you need to use some co-routine
>> library. But this is non-trivial and requires some changes to Spirit
>> itself.
>>
> Thank you.
>
> Could a feature request be made so that the Spirit Framework could be
> used to parse content sent over a network connection?
>
> Would the rest of the community find such a feature useful?
>
> Kind Regards,
> Etienne

IMHO this feature would be very useful. I personally use Spirit to parse
HTTP headers. This works fine, but currently I have to check for the end of
the HTTP header ("\r\n\r\n") separately, and only after detecting the end is
the parser called. It all boils down to parsing the byte stream twice!
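
For illustration, the end-of-header check can at least be kept cheap by scanning only the newly arrived bytes, plus the few bytes before them that could begin the terminator. This is a minimal sketch under that assumption, not what I actually run; HeaderBuffer and its members are hypothetical names. It does not remove the second pass, since the header is still handed to the parser afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: buffers incoming bytes and reports where the HTTP
// header block ends, rescanning only the bytes that could complete "\r\n\r\n".
class HeaderBuffer {
public:
    // Appends a chunk; returns true once the terminator has been seen.
    bool append(const std::string& chunk) {
        if (complete()) {
            data_ += chunk;  // header already found: this is body data
            return true;
        }
        // The terminator may straddle two chunks, so back up by three bytes.
        const std::size_t start = data_.size() >= 3 ? data_.size() - 3 : 0;
        data_ += chunk;
        const std::size_t pos = data_.find("\r\n\r\n", start);
        if (pos != std::string::npos) header_end_ = pos + 4;
        return complete();
    }

    bool complete() const { return header_end_ != std::string::npos; }

    // Header block including the terminator; empty until complete.
    std::string header() const {
        return complete() ? data_.substr(0, header_end_) : std::string();
    }

    // Bytes received after the header block (the start of the body).
    std::string remainder() const {
        return complete() ? data_.substr(header_end_) : std::string();
    }

private:
    std::string data_;
    std::size_t header_end_ = std::string::npos;
};
```

Once append() returns true, header() holds exactly the text that would be passed to the header parser in a single call.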

If Spirit could store its internal state, it would be great. The biggest
win would be to accept boost::asio::tcp::iostream as parser input.

Cmd. Pirx



Re: Parsing progressively...

by Lucas Thode



On Thu, Jul 9, 2009 at 6:18 AM, Commander Pirx <CmdPirx@...> wrote:
<<quote snipped>>

> If Spirit could store it's internal state, it would be great. The biggest
> would be, to accept boost::asio::tcp::iostream as parser input.
>
> Cmd. Pirx

Actually, you can accept boost::asio::tcp::iostream as parser input by
constructing an istream_iterator from it and then wrapping that
istream_iterator in a multi_pass.

--Lucas


Re: Parsing progressively...

by Commander Pirx

"Lucas Thode" <ljthode@...> wrote in message news:cb3b33ef0907090705r77a86a6jacb330a72aaba001@......


> On Thu, Jul 9, 2009 at 6:18 AM, Commander Pirx <CmdPirx@...> wrote:
> <<quote snipped>>
>
>> If Spirit could store it's internal state, it would be great. The biggest
>> would be, to accept boost::asio::tcp::iostream as parser input.
>>
>> Cmd. Pirx
>
> Actually, you can accept boost::asio::tcp::iostream as parser input by
> constructing an istream_iterator from it and then wrapping that
> istream_iterator in a multi_pass.
>
> --Lucas

Is this also true for Spirit 1.8?
Currently I work with the classic version because Spirit 2.x seems to be a
work in progress. What I really need is a parser that can be fed byte by
byte. This is the pseudocode:
 
while (receiving)
{
    char c = receive_byte();
    parser.consume( c );
}
 
Thanks for the hint.
-- Pirx

Re: Parsing progressively...

by Hartmut Kaiser

> On Thu, Jul 9, 2009 at 6:18 AM, Commander Pirx <CmdPirx@...>
> wrote:
> <<quote snipped>>
>
> If Spirit could store it's internal state, it would be great. The
> biggest would be, to accept boost::asio::tcp::iostream as parser input.
>
> Cmd. Pirx


> Actually, you can accept boost::asio::tcp::iostream as parser input by
> constructing an istream_iterator from it and then wrapping that
> istream_iterator in a multi_pass.

FWIW, using istream_iterator should be sufficient. There is no need to
additionally wrap it in a multi_pass.

Regards Hartmut



Re: Parsing progressively...

by Lucas Thode



On Thu, Jul 9, 2009 at 9:42 AM, Hartmut Kaiser <hartmut.kaiser@...> wrote:
>> On Thu, Jul 9, 2009 at 6:18 AM, Commander Pirx <CmdPirx@...>
>> wrote:
>> <<quote snipped>>
>>
>> If Spirit could store it's internal state, it would be great. The
>> biggest would be, to accept boost::asio::tcp::iostream as parser input.
>>
>> Cmd. Pirx
>
>> Actually, you can accept boost::asio::tcp::iostream as parser input by
>> constructing an istream_iterator from it and then wrapping that
>> istream_iterator in a multi_pass.
>
> FWIW, using istream_iterator should be sufficient. There is no need to
> additionally wrap it in a multi_pass.
>
> Regards Hartmut

I take it that Spirit2.x can use input iterators?

--Lucas


Re: Parsing progressively...

by Hartmut Kaiser

> > Actually, you can accept boost::asio::tcp::iostream as parser input
> by
> > constructing an istream_iterator from it and then wrapping that
> > istream_iterator in a multi_pass.
> FWIW, using istream_iterator should be sufficient. There is no need to
> additionally wrap it in a multi_pass.
>
> Regards Hartmut
> I take it that Spirit2.x can use input iterators?

Doh! My bad. You're certainly right, the iterators have to be at least
forward iterators.
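
To illustrate why a plain input iterator is not enough: a backtracking parser has to return to a saved position after a failed alternative, and a single-pass stream cannot be re-read. The sketch below shows the buffering idea that makes this possible, using the standard library only. It is not Spirit's multi_pass implementation; ReplayBuffer and match are hypothetical names, and positions are plain indices rather than iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical sketch of the buffering idea (not Spirit's multi_pass itself):
// characters pulled from a single-pass stream are kept, so that a saved
// position can be revisited, which is what backtracking needs.
class ReplayBuffer {
public:
    explicit ReplayBuffer(std::istream& in) : in_(in) {}

    // True if a character exists at index i, reading from the stream on demand.
    bool has(std::size_t i) {
        while (seen_.size() <= i) {
            const int c = in_.get();
            if (c == std::char_traits<char>::eof()) return false;
            seen_.push_back(static_cast<char>(c));
        }
        return true;
    }

    // Character at index i; has(i) must have returned true beforehand.
    char at(std::size_t i) const {
        assert(i < seen_.size());
        return seen_[i];
    }

private:
    std::istream& in_;
    std::string seen_;
};

// Tries to match `word` at `pos`. On failure `pos` is left unchanged, so the
// caller can try another alternative from the same position.
bool match(ReplayBuffer& buf, std::size_t& pos, const std::string& word) {
    std::size_t p = pos;
    for (char expected : word) {
        if (!buf.has(p) || buf.at(p) != expected) return false;
        ++p;
    }
    pos = p;
    return true;
}
```

A failed match() may already have consumed characters from the stream, but they stay in the buffer, so the next attempt from the same position sees them again. The sketch never discards buffered characters, so memory grows with the input.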

Regards Hartmut




Re: Parsing progressively...

by Lucas Thode



On Thu, Jul 9, 2009 at 9:38 AM, Commander Pirx <CmdPirx@...> wrote:
<<quote snipped>>
> Is this also true for spirit 1.8?

Yes, it is; the iterator requirements have not changed between Spirit 1.x and Spirit 2.x.

--Lucas

