istream_iterator<char> and Spirit.lex


istream_iterator<char> and Spirit.lex

by Jean-Francois Ostiguy


Hello again -

I am trying to have my lexer read from a file using istream iterators.

So I defined:

typedef std::istream_iterator<char>               char_it_type;
typedef lex::lexertl::token<char_it_type>         token_type;
typedef lex::lexertl::actor_lexer<token_type>     lexer_type;
typedef my_lexer<lexer_type>::iterator_type       iterator_type;

....

 my_lexer<lexer_type>      lexer;            
 my_grammar<iterator_type> grammar(lexer);  

....

When the lexer is instantiated, I get this error:

boost/spirit/home/lex/lexer/lexertl/iterator_tokenizer.hpp:56:
error: no match for 'operator-' in 'start_token_ - 1'

/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.2/include/g++v4/bits/stl_bvector.h:179:
note: candidates are: ptrdiff_t std::operator-(const
std::_Bit_iterator_base&, const std::_Bit_iterator_base&)                                                                                                            
 
Note that if char_it_type is set to char const*, everything is fine.

What are the requirements on IteratorT in lex::lexertl::token<IteratorT> ?

-Francois
 


------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Spirit-general mailing list
Spirit-general@...
https://lists.sourceforge.net/lists/listinfo/spirit-general

Re: istream_iterator<char> and Spirit.lex

by OvermindDL1

On Fri, Oct 30, 2009 at 1:14 PM, Jean-Francois Ostiguy <ostiguy@...> wrote:

>
> Hello again -
>
> I am trying to have my lexer read from a file using istream iterators.
>
> So I defined:
>
> typedef std::istream_iterator<char>               char_it_type;
> typedef lex::lexertl::token<char_it_type>         token_type;
> typedef lex::lexertl::actor_lexer<token_type>     lexer_type;
> typedef my_lexer<lexer_type>::iterator_type       iterator_type;
>
> ....
>
>  my_lexer<lexer_type>      lexer;
>  my_grammar<iterator_type> grammar(lexer);
>
> ....
>
> When  the lexer is instantiated I get this error:
>
> boost/spirit/home/lex/lexer/lexertl/iterator_tokenizer.hpp:56:
> error: no match for 'operator-' in 'start_token_ - 1'
>
> /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.2/include/g++v4/bits/stl_bvector.h:179:
> note: candidates are: ptrdiff_t std::operator-(const
> std::_Bit_iterator_base&, const std::_Bit_iterator_base&)
>
> Note that if char_it_type is set to char const*, everything is fine.
>
> What are the requirements on IteratorT in lex::lexertl::token<IteratorT> ?

I believe it is the same everywhere in Spirit, you need a forward
iterator, and, well, an input iterator does not fulfill the
requirements for a forward iterator.  However, Spirit includes a
handy-dandy multi-pass wrapper that turns an input iterator into a
forward iterator.  Check the docs.  :)
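
As a rough illustration of why an input iterator falls short of a forward
iterator, here is a minimal sketch that uses only the standard library (no
Spirit). The helper names read_pass and two_passes are made up for this
example. A stream can be walked only once through std::istream_iterator,
whereas a forward iterator must allow the same range to be traversed again.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: drain whatever is left in the stream through a
// pair of input iterators.
std::string read_pass(std::istream& in) {
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}

// Hypothetical helper: try to traverse the same stream twice.
// Returns {result of first pass, result of second pass}.
std::pair<std::string, std::string> two_passes(const std::string& text) {
    std::istringstream in(text);
    in >> std::noskipws;  // istream_iterator<char> skips whitespace by default
    std::string first = read_pass(in);   // consumes the stream
    std::string second = read_pass(in);  // stream is exhausted: nothing left
    return std::make_pair(first, second);
}
```

With this sketch, two_passes("ab\ncd") returns the full text as the first
element and an empty string as the second, because the first pass has already
consumed the stream.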


Re: istream_iterator<char> and Spirit.lex

by Jean-Francois Ostiguy-2

OvermindDL1 wrote:


>> What are the requirements on IteratorT in lex::lexertl::token<IteratorT>
>> ?
>
> I believe it is the same everywhere in Spirit, you need a forward
> iterator, and, well, an input iterator does not fulfill the
> requirements for a forward iterator.  However, Spirit includes a
> handy-dandy multi-pass wrapper that turns an input iterator into a
> forward iterator.  Check the docs.  :)
>

Thanks !
It took me a while to find the relevant info in the docs ... under
"Supporting Libraries / multipass iterator". Perhaps one of the
"quick start" examples should demonstrate using a stream as input.

-Francois      





Spirit.lex Iterator requirements

by Jean-Francois Ostiguy-2

OvermindDL1 wrote:


>> What are the requirements on IteratorT in lex::lexertl::token<IteratorT>
>> ?
>
> I believe it is the same everywhere in Spirit, you need a forward
> iterator, and, well, an input iterator does not fulfill the
> requirements for a forward iterator.  However, Spirit includes a
> handy-dandy multi-pass wrapper that turns an input iterator into a
> forward iterator.  Check the docs.  :)
>

Well, I rejoiced a bit too fast. After reading the docs and playing a bit
with the multipass adapter I conclude that it does not address my issue.

As far as I can tell, the Lex library requires *better* than a Forward
iterator. Again, the error I get is :

boost/spirit/home/lex/lexer/lexertl/iterator_tokenizer.hpp:56:
error: no match for 'operator-' in 'start_token_ - 1'
   
The actual code is:

...
  if (BOL_state_ && (start_token_ == start_ ||
                     *(start_token_ - 1) == '\n'))
  {
....

If start_token_ only meets the requirements for a Forward iterator, then
(start_token_ - 1)  is not allowed.
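
To make the iterator-category point concrete, here is a small sketch that uses
only the standard library, not Spirit. The helper has_category is a made-up
name for this example. It checks at compile time which kinds of iterators can
support an expression like (it - 1), which needs random access, and which can
only support --it, which needs a bidirectional iterator.

```cpp
#include <forward_list>
#include <istream>
#include <iterator>
#include <list>
#include <type_traits>

// Hypothetical helper: true when It's iterator category is Tag or stronger.
template <typename It, typename Tag>
constexpr bool has_category() {
    return std::is_base_of<
        Tag, typename std::iterator_traits<It>::iterator_category>::value;
}

// A raw pointer is a random access iterator, so `p - 1` compiles.
static_assert(has_category<char const*, std::random_access_iterator_tag>(),
              "pointers are random access iterators");

// istream_iterator is an input iterator only: not even a forward iterator.
static_assert(!has_category<std::istream_iterator<char>,
                            std::forward_iterator_tag>(),
              "istream_iterator is input-only");

// A forward iterator can be re-traversed but cannot step backwards.
static_assert(has_category<std::forward_list<char>::iterator,
                           std::forward_iterator_tag>(),
              "forward_list iterators are forward iterators");
static_assert(!has_category<std::forward_list<char>::iterator,
                            std::bidirectional_iterator_tag>(),
              "forward iterators support neither --it nor it - 1");

// A bidirectional iterator supports --it but still not it - 1.
static_assert(has_category<std::list<char>::iterator,
                           std::bidirectional_iterator_tag>(),
              "list iterators are bidirectional");
static_assert(!has_category<std::list<char>::iterator,
                            std::random_access_iterator_tag>(),
              "bidirectional iterators do not support it - 1");
```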

I confirmed that wrapping an istream_iterator<char> in the multi_pass
adapter produces exactly the same error.

It seems odd that no straightforward mechanism is provided for using
istream_iterators. I can of course read the entire file first into, say,
a vector<char>, but this is ugly.

-Francois  




Re: Spirit.lex Iterator requirements

by Hartmut Kaiser

> >> What are the requirements on IteratorT in lex::lexertl::token<IteratorT> ?
> >
> > I believe it is the same everywhere in Spirit, you need a forward
> > iterator, and, well, an input iterator does not fulfill the
> > requirements for a forward iterator.  However, Spirit includes a
> > handy-dandy multi-pass wrapper that turns an input iterator into a
> > forward iterator.  Check the docs.  :)
> >

Sorry for the delayed answer, I was really busy with preparing the release.

> Well, I rejoiced a bit too fast. After reading the docs and playing a
> bit with the multipass adapter I conclude that it does not address my
> issue.

Currently the lexer requires a random_access_iterator for accessing the
underlying input data. I know that's a hard limitation and I'm sure we can
relax that a bit (bidirectional_iterator should be possible, I'm not sure
about it being a forward_iterator). It simply has not been done yet (but it
is on my list of things to fix). If that's an issue for you I can try to
work on this asap.

> As far as I can tell, the Lex library requires *better* than a Forward
> iterator. Again, the error I get is :
>
> boost/spirit/home/lex/lexer/lexertl/iterator_tokenizer.hpp:56:
> error: no match for 'operator-' in 'start_token_ - 1'
>
> The actual code is:
>
> ...
>   if (BOL_state_ && (start_token_ == start_ ||
>                     *(start_token_ - 1) == '\n'))
>                 {
> ....
>
> If start_token_ only meets the requirements for a Forward iterator,
> then (start_token_ - 1) is not allowed.
>
> I confirmed that wrapping an istream_iterator<char> in the multi_pass
> adapter produces exactly the same error.

Yep, that's expected as multi_pass exposes a forward_iterator only.

> It seems odd that no straightforward mechanism is provided for using
> istream_iterators. I can of course read the entire file first into,
> say, a vector<char>, but this is ugly.

Reading the input into a std::string works as well...
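
A minimal sketch of that workaround, using only the standard library: slurp is
a hypothetical helper name, and how the resulting range is handed to the lexer
is not shown here. The sketch uses std::istreambuf_iterator rather than
std::istream_iterator because the latter skips whitespace by default, which
would silently drop the newlines a lexer usually cares about.

```cpp
#include <istream>
#include <iterator>
#include <sstream>  // only needed for the in-memory stream in the usage note
#include <string>

// Hypothetical helper: read the whole stream into memory, keeping
// whitespace and newlines exactly as they appear in the input.
std::string slurp(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Given std::string buf = slurp(file), the pair buf.data() and
buf.data() + buf.size() is a char const* range, the iterator type the original
post reports as working.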

Regards Hartmut

-------------------
Meet me at BoostCon
http://boostcon.com





Re: Spirit.lex Iterator requirements

by Jean-Francois Ostiguy-2

Hartmut Kaiser wrote:


> Sorry for the delayed answer, I was really busy with preparing the
> release.
>

Hartmut -

I am certainly not expecting you to answer me right away ... or even at all,
for that matter ;-)
I do, however, very much appreciate your efforts and your patience!

I had looked at spirit a few years ago for a project and at the time decided
to stick with flex/bison. I just recently found out about spirit 2.X and I
have to say I am impressed. That said, even if the library is well designed
and much better documented than most, the learning curve is steep.    

I can live with having to read a file into a string before lexing.
I do think that supporting istream_iterator in some way is
important ... after all, lexing/parsing a character stream from a file is
a very common task. One should not need to allocate 1 GB of memory to
lex a 1 GB file.

> Currently the lexer requires a random_access_iterator for accessing the
> underlying input data.

Under what circumstance(s) would you really need to access tokens randomly ?
Certainly in the specific case of the construct

  if (BOL_state_ && (start_token_ == start_ ||
                     *(start_token_ - 1) == '\n'))
                 
a bidirectional iterator e.g.

  *(--start_token_)

would do fine.

I would think that what is needed is a kind of 'limited' bidirectional
iterator implemented with an iostream buffer. If the buffer ever underflows
(this should be very rare if the buffer is large enough), a minimal
implementation could simply generate a runtime error and ask the user to
use a larger buffer.
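
One way to picture this idea is the rough sketch below. It is not an iterator
and not anything Spirit provides. The class lookbehind_reader and its members
are hypothetical names, and a std::deque stands in for the buffer. It reads
forward from a stream, remembers a bounded number of recent characters, and
throws a runtime error when asked to look further back than it remembers.

```cpp
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>
#include <stdexcept>

// Hypothetical helper: forward reader with a bounded look-behind window.
class lookbehind_reader {
public:
    lookbehind_reader(std::istream& in, std::size_t capacity)
        : in_(in), capacity_(capacity) {}

    // Read the next character into c; returns false at end of input.
    bool next(char& c) {
        if (!in_.get(c)) return false;
        history_.push_back(c);
        // Forget the oldest character once the window is full.
        if (history_.size() > capacity_) history_.pop_front();
        return true;
    }

    // Character read `back` steps ago (0 is the most recently read one).
    // Throws if that character is no longer (or not yet) in the window.
    char behind(std::size_t back) const {
        if (back >= history_.size())
            throw std::runtime_error(
                "requested character is outside the look-behind buffer");
        return history_[history_.size() - 1 - back];
    }

private:
    std::istream& in_;
    std::size_t capacity_;
    std::deque<char> history_;
};
```

A beginning-of-line check only ever needs behind(0), so a very small window
would do for that case. Memory use stays bounded by the capacity regardless of
the size of the input.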
 
Regards

-Francois

 




Re: Spirit.lex Iterator requirements

by Hartmut Kaiser

> I am certainly not expecting you to answer me right away ... or even at
> all, for that matter ;-)
> I do, however, very much appreciate your efforts and your patience!
>
> I had looked at spirit a few years ago for a project and at the time
> decided to stick with flex/bison. I just recently found out about
> spirit 2.X and I have to say I am impressed. That said, even if the
> library is well designed and much better documented than most, the
> learning curve is steep.

I'm fully aware that the Lex docs are not complete yet. But the examples
should give you some insight into what's possible. Generally we tried to make
everything available to replace flex. Please don't hesitate to ask if you're
stuck or if you don't know how to implement a specific feature.

> I can live with having to read a file into a string before lexing.
> I do think that supporting istream_iterator in some way is
> important ... after all, lexing/parsing a character stream from a file is
> a very common task. One should not need to allocate 1 GB of memory to
> lex a 1 GB file.

I know. Support for lesser underlying iterators is high on my todo list.

> > Currently the lexer requires a random_access_iterator for accessing the
> > underlying input data.
>
> Under what circumstance(s) would you really need to access tokens randomly ?
> Certainly in the specific case of the construct
>
>   if (BOL_state_ && (start_token_ == start_ ||
>                      *(start_token_ - 1) == '\n'))
>
> a bidirectional iterator e.g.
>
>   *(--start_token_)
>
> would do fine.

Not really this way, but something similar is definitely possible. My plan
is to provide different tokenization algorithms for different iterator
types. The more functionality can be relied on from the iterator the simpler
the tokenization will be.
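
A hedged sketch of what selecting an algorithm by iterator type could look
like for the beginning-of-line check alone, using standard tag dispatch. This
is not Spirit.Lex code. The function at_line_start and its prev parameter are
hypothetical. With a bidirectional (or better) iterator the preceding
character is inspected through std::prev, which leaves the caller's iterator
untouched, unlike --start_token_. With a forward-only iterator the caller has
to remember the previously consumed character and pass it in.

```cpp
#include <forward_list>
#include <iterator>
#include <list>
#include <string>

// Bidirectional or better: inspect the preceding character directly.
template <typename It>
bool at_line_start(It start, It cur, char /*prev*/,
                   std::bidirectional_iterator_tag) {
    return cur == start || *std::prev(cur) == '\n';
}

// Forward only: stepping back is impossible, so rely on the character
// the caller remembered while advancing.
template <typename It>
bool at_line_start(It start, It cur, char prev, std::forward_iterator_tag) {
    return cur == start || prev == '\n';
}

// Dispatcher: picks the most capable overload for the iterator's category.
template <typename It>
bool at_line_start(It start, It cur, char prev) {
    typedef typename std::iterator_traits<It>::iterator_category category;
    return at_line_start(start, cur, prev, category());
}
```

Random access iterators such as std::string::iterator or char const* select
the bidirectional overload, since their category tag derives from
std::bidirectional_iterator_tag.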

> I would think that what is needed is a kind of 'limited' bidirectional
> iterator implemented with an iostream buffer. If the buffer ever
> underflows (this should be very rare if the buffer is large enough), a
> minimal implementation could simply generate a runtime error and ask
> the user to use a larger buffer.

Any input iterator can always be wrapped using multi_pass to get a forward
iterator. And I'm now confident that I can implement a tokenization
algorithm for forward iterators. So you should be fine in the end. But this
will have to wait for the next release. Additionally, I'll probably add
some examples to demonstrate how this can be done.

Thanks!
Regards Hartmut


