Boost.Lex ... how is a token value initialized ?

I am back again ;-(

I am now trying to understand something that should be simple, but so far I have been unable to find a satisfactory answer just by consulting the documentation. I admit that I did not read _all_ the documentation ... but I am trying to avoid diving into all of Phoenix, Variant, etc. ... at least for now.

So the question is this:

I have a token defined as follows

    integer = "[0-9]+";
    ....

    lex::token_def<int> integer;

When the token is matched, something must set its value attribute to the binary representation of an int. This requires calling a function to translate the matched ASCII string that represents the integer. In flex, one would do something like this:

    {integer}  { yylval->ival = atoi(yytext);
                 return token::INT_TOKEN;
               }

In this case, I think I am supposed to use a semantic action of some kind,

    integer[ val_ = ??? ]

but what goes on the rhs (at this point I am not even sure about the lhs)? Am I supposed to write my own lambda/Phoenix expression? Do I use the _start, _end iterators to copy the matched string into a stringstream and read it back into an int? Is there a better, pre-defined solution?

This is not clear at all. The "quick start" examples are not very useful, since they conveniently avoid the issue by merely counting tokens, never initializing any value attribute. So far, all my attempts at writing a suitable semantic action have resulted in an orgy of template errors.

An example would go a long way ...

-Francois

------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference
_______________________________________________
Spirit-general mailing list
Spirit-general@...
https://lists.sourceforge.net/lists/listinfo/spirit-general
Re: Boost.Lex ... how is a token value initialized ?

Francois,

> I am back again ;-(

Welcome back! :-)

> I am now trying to understand something that should be simple,
> but so far, I have been unable to find a satisfactory answer just by
> consulting the documentation. I admit that I did not read _all_ the
> documentation ... but I am trying to avoid diving into all of
> Phoenix, Variant etc, ... at least for now.
>
> So the question is this:
>
> I have a token defined as follows
>
>     integer = "[0-9]+";
>     ....
>
>     lex::token_def<int> integer;
>
> When the token is matched, something must set its value attribute to
> the binary representation of an int. This requires calling a function
> to translate the matched ascii string that represents the integer.
> In flex, one would do something like this.
>
>     {integer}  { yylval->ival=atoi(yytext);
>                  return token::INT_TOKEN;
>                }
>
> In this case, I think I am supposed to use a semantic action of some
> kind,
>
>     integer[ val_ = ??? ]
>
> but what goes on the rhs ( at this point I am not even sure about the
> lhs) ?
> Am I supposed to write my own lambda/Phoenix expression ? Do I use
> _start, _end iterators to copy the matched string into a stringstream
> and read it back into an int ? Is there a better, pre-defined,
> solution ?

The trick is that you don't have to do anything. By specifying the token value type to be int you're defining the attribute type of this token definition if used as a parser as well. Spirit.Lex knows how to convert all built-in types from the matched input. So there is no need to attach any semantic actions to the lexer. Let's have a look at an example:

    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/lex_lexertl.hpp>
    #include <boost/spirit/include/phoenix_operator.hpp>
    #include <iostream>

    namespace lex = boost::spirit::lex;
    namespace qi = boost::spirit::qi;

    template <typename Lexer>
    struct print_numbers_tokens : lex::lexer<Lexer>
    {
        print_numbers_tokens()
          : print_numbers_tokens::base_type()
        {
            integer = "[1-9][0-9]*";

            // match integers, silently skip any other character
            this->self
                =   integer
                |   lex::string(".")[lex::_pass = lex::pass_flags::pass_ignore]
                ;
        }

        lex::token_def<int> integer;
    };

    template <typename Iterator>
    struct print_numbers_grammar : qi::grammar<Iterator>
    {
        // the constructor is a template because print_numbers_tokens
        // is itself a class template and cannot be named without its
        // Lexer argument
        template <typename TokenDef>
        print_numbers_grammar(TokenDef const& def)
          : print_numbers_grammar::base_type(start)
        {
            start = *def.integer[std::cout << qi::_1 << "\n"];
        }

        qi::rule<Iterator> start;
    };

This will print all integer numbers in a file, ignoring everything else. The qi::_1 in the semantic action of the start rule refers to the 'int' attribute exposed by the token definition (and it actually is of type 'int'). The conversion of the matched input sequence to int is executed on demand only, when it's accessed for the first time. If the same token happens to be inspected a second time (because of backtracking in the parser), the integer will still be available without any need to convert it from the input string again.

> This is not clear at all. The "quick start" examples are not very
> useful, since they conveniently avoid the issue by merely counting
> tokens, never initializing any value attribute. So far, all my
> attempts at writing a suitable semantic action have resulted an orgy
> of template errors.

Again, I'm sorry for the incomplete documentation, I'll try to catch up as soon as possible.

Regards
Hartmut

-------------------
Meet me at BoostCon
http://boostcon.com
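The on-demand conversion described in the reply above can be pictured with a small stand-alone sketch. It is plain C++ and does not use Spirit at all: lazy_int_token, value() and conversions() are hypothetical names invented for illustration, and std::stoi merely stands in for whatever conversion Spirit.Lex performs internally. It shows only the idea of keeping the matched characters, converting them on first access, and reusing the cached result afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a lexer token (NOT a Spirit.Lex type): it stores
// the matched characters and converts them to int only when first asked.
class lazy_int_token {
public:
    // Copy the matched input sequence delimited by the iterator pair.
    template <typename Iterator>
    lazy_int_token(Iterator first, Iterator last)
        : matched_(first, last), value_(0), converted_(false), conversions_(0) {}

    // Convert on first access; later calls return the cached value.
    int value() {
        if (!converted_) {
            std::size_t used = 0;
            // std::stoi throws if the text is not numeric or out of range.
            int const v = std::stoi(matched_, &used);
            if (used != matched_.size())
                throw std::invalid_argument("trailing characters after integer");
            value_ = v;
            converted_ = true;
            ++conversions_;
        }
        return value_;
    }

    // Number of real conversions performed so far (for demonstration only).
    int conversions() const { return conversions_; }

private:
    std::string matched_;
    int value_;
    bool converted_;
    int conversions_;
};

// Usage example: inspecting the token twice converts the text only once.
inline void lazy_int_token_example() {
    std::string const input = "42";
    lazy_int_token tok(input.begin(), input.end());
    assert(tok.value() == 42);
    assert(tok.value() == 42);
    assert(tok.conversions() == 1);
}
```

The cached flag is what makes a second inspection after backtracking cheap in this sketch; how Spirit.Lex stores the converted value internally is not shown here.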
Re: Boost.Lex ... how is a token value initialized?

Hartmut Kaiser wrote:

> The trick is that you don't have to do anything. By specifying the
> token value type to be int you're defining the attribute type of this
> token definition if used as a parser as well. Spirit.Lex knows how to
> convert all build in types from the matched input. So not need to
> attach any semantic actions to the lexer:

Hartmut -

Thank you for your careful and detailed explanation ... There are things that remain rather nebulous.

I used 'integer' as an example because it does not hold a string as a value attribute. While having automatic conversion for some built-in types is nice, what happens when the value attribute is to be an instance of some unspecified (user-defined) class?

Surely, in general, I need to provide some code to explain to the lexer how the conversion is done. To fix ideas, suppose I have a class for rational numbers and each rational number is represented by { n, m }, i.e. { 3,4 } would be 3/4. I want to tokenize "{3,4}" and store an attribute of type Rational in my token value attribute, using the constructor Rational(3,4) to initialize it. In that case, I think I would have to write a custom semantic action. I am not sure how I would do this ... How does my custom semantic action get access to the token string representation and to the value attribute?

Again, thanks for your patience.

-Francois
Re: Boost.Lex ... how is a token value initialized?

> > The trick is that you don't have to do anything. By specifying the
> > token value type to be int you're defining the attribute type of
> > this token definition if used as a parser as well. Spirit.Lex knows
> > how to convert all build in types from the matched input. So not
> > need to attach any semantic actions to the lexer:
>
> Hartmut -
>
> Thank you for your careful and detailed explanation ... There are
> things that remain rather nebulous.
>
> I used 'integer' as an example because it does not hold a string as a
> value attribute. While having automatic conversion for some built-in
> types is nice, what happens when the value attribute to be an
> instance of some unspecified (user-defined) class ?
>
> Surely in general, I need to provide some code to explain to the
> lexer how the conversion is done. To fix ideas, suppose I have a
> class for rational numbers and each rational number is represented
> by { n, m } i.e. { 3,4 } would be 3/4. I want to tokenize "{3,4}" and
> store an attribute of type Rational in my token value attribute using
> the constructor Rational(3,4) to initialize it. In that case, I think
> would have to write a custom semantic action. I am not sure how I
> would do this ... How does my custom semantic action get access to
> the token string representation and to the value attribute ?

Good question, and I have to admit this is not documented yet (it is a missing paragraph in the section 'Customization of Spirit's Attribute Handling', I'll add it asap).

For user-defined types you need to specialize the following template:

    // this is the default/main template definition (contained in Spirit)
    namespace boost { namespace spirit { namespace traits
    {
        template <typename Attribute, typename Iterator
          , typename Enable /* = void*/>
        struct assign_to_attribute_from_iterators
        {
            static void
            call(Iterator const& first, Iterator const& last, Attribute& attr)
            {
                attr = Attribute(first, last);
            }
        };
    }}}

    // this is an example for a user-defined type foo
    namespace boost { namespace spirit { namespace traits
    {
        template <typename Iterator>
        struct assign_to_attribute_from_iterators<foo, Iterator>
        {
            static void
            call(Iterator const& first, Iterator const& last, foo& attr)
            {
                attr = foo(first, last);    // construct foo from iterators
            }
        };
    }}}

The iterators passed to call() point to the matched input sequence. Spirit will use this specialization to convert the iterator pair to your data type.

Regards
Hartmut
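To connect the trait above with the Rational example from the question, here is a minimal sketch of what the body of such a specialization might look like. To keep it compilable without Boost, the primary template is reproduced in a stand-in namespace called sketch (an assumption for illustration; with Spirit the specialization would go into boost::spirit::traits as shown in the reply). The Rational struct, its members num and den, and the hand-written "{n,m}" parsing are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical user-defined type from the question: {n,m} stands for n/m.
struct Rational {
    int num;
    int den;
    Rational() : num(0), den(1) {}
    Rational(int n, int m) : num(n), den(m) {}
};

namespace sketch {  // stand-in namespace, NOT boost::spirit::traits

// Same shape as the primary template quoted in the reply.
template <typename Attribute, typename Iterator, typename Enable = void>
struct assign_to_attribute_from_iterators {
    static void call(Iterator const& first, Iterator const& last, Attribute& attr) {
        attr = Attribute(first, last);
    }
};

// Specialization for Rational: turn the matched text "{n,m}" into Rational(n, m).
template <typename Iterator>
struct assign_to_attribute_from_iterators<Rational, Iterator> {
    static void call(Iterator const& first, Iterator const& last, Rational& attr) {
        std::string const text(first, last);
        std::size_t const comma = text.find(',');
        if (text.size() < 5 || text.front() != '{' || text.back() != '}' ||
            comma == std::string::npos)
            throw std::invalid_argument("expected {n,m}");
        // Characters between '{' and ',' form n; between ',' and '}' form m.
        int const n = std::stoi(text.substr(1, comma - 1));
        int const m = std::stoi(text.substr(comma + 1, text.size() - comma - 2));
        attr = Rational(n, m);
    }
};

}  // namespace sketch

// Usage example: convert an iterator pair over "{3,4}" into a Rational.
inline Rational rational_from_text(std::string const& text) {
    Rational r;
    sketch::assign_to_attribute_from_iterators<
        Rational, std::string::const_iterator>::call(text.begin(), text.end(), r);
    return r;
}
```

Since the conversion receives nothing but the iterator pair, all knowledge of the "{n,m}" syntax has to live inside call(); a regular expression on the token definition would only decide which characters get handed to it.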