Re: a bug in tokenize_and_parse ?
> I am continuing my experiments with a standalone lexer.
>
> When I call
>
>     bool ok = lex::tokenize_and_parse(first, last, lexer, grammar);
>
> the function returns true even if parsing fails.
> It is easy enough to verify that the parsing failed, since upon return
>
>     (first == last) is false
>
> and 'first' indeed points to the token where the error occurred.
>
> If the error occurs in the lexer, tokenize_and_parse returns false
> and the 'first' pointer points to the correct token. In other words,
> it seems that an error in the parser (grammar) does not result in
> lex::tokenize_and_parse(...) returning false when a separate lexer
> is used.
>
> I am not sure if this qualifies as a "bug".
> If the behavior is correct, then the example in Lex 'Quickstart 3 -
> Counting Words Using a Parser' is misleading
>
>     ...
>     bool r = lex::tokenize_and_parse(first, last, word_count, g);
>
>     if (r) {
>         std::cout << "lines: " << g.l << ", words: " << g.w
>                   << ", characters: " << g.c << "\n";
>     }
>     else {
>         std::string rest(first, last);
>         std::cerr << "Parsing failed\n" << "stopped at: \""
>                   << rest << "\"\n";
>     }
>     return 0;
>     ...
>
> since the function may return success even if the parsing fails along
> the way. Of course, in this example the parser may be too simple for
> this condition to occur, but one is left with the impression that
> testing (r) is sufficient to establish success ... which it is not.
>
> Comments?

From looking at the implementation of tokenize_and_parse I can't spot any problems. Could you provide us with a small example reproducing this behavior? I would consider it a bug if it behaved the way you're describing.

Regards Hartmut
-------------------
Meet me at BoostCon
http://boostcon.com

------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Spirit-general mailing list
Spirit-general@...
https://lists.sourceforge.net/lists/listinfo/spirit-general
|
|
a bug in tokenize_and_parse ?
Hello again.

I am continuing my experiments with a standalone lexer.

When I call

    bool ok = lex::tokenize_and_parse(first, last, lexer, grammar);

the function returns true even if parsing fails.
It is easy enough to verify that the parsing failed, since upon return

    (first == last) is false

and 'first' indeed points to the token where the error occurred.

If the error occurs in the lexer, tokenize_and_parse returns false and the 'first' pointer points to the correct token. In other words, it seems that an error in the parser (grammar) does not result in lex::tokenize_and_parse(...) returning false when a separate lexer is used.

I am not sure if this qualifies as a "bug".
If the behavior is correct, then the example in Lex 'Quickstart 3 - Counting Words Using a Parser' is misleading

    ...
    bool r = lex::tokenize_and_parse(first, last, word_count, g);

    if (r) {
        std::cout << "lines: " << g.l << ", words: " << g.w
                  << ", characters: " << g.c << "\n";
    }
    else {
        std::string rest(first, last);
        std::cerr << "Parsing failed\n" << "stopped at: \""
                  << rest << "\"\n";
    }
    return 0;
    ...

since the function may return success even if the parsing fails along the way. Of course, in this example the parser may be too simple for this condition to occur, but one is left with the impression that testing (r) is sufficient to establish success ... which it is not.

Comments?
|
|
Re: a bug in tokenize_and_parse ?
> From looking at the implementation of tokenize_and_parse I can't spot any
> problems. Could you provide us with a small example reproducing this
> behavior? I would consider it a bug if it behaved the way you're
> describing.

Ok. I attach a simple test.
The parser parses a trivial toy language.

First I use the "correct" syntax. In that case, all is well. All 7 lines are parsed. Here is the output:

-----------------------------------------------------
Parsing :
BEGIN section
gaussian pmin = 0.0 pmax=2.17 qmin = 0.0 qmax = 1.0;
gaussian pmin = 0.0 pmax=2.17 qmin = 0.0 qmax = 1.0;
END section
BEGIN section
Identity;
END section

rule_dbg: delimiter
rule_dbg: declaration
rule_dbg: declaration
rule_dbg: delimiter
rule_dbg: delimiter
rule_dbg: declaration
rule_dbg: delimiter
lex::tokenize_and_parse succeeds.
(first == last) = 1
----------------------------------------------

When I introduce a syntax error on line 6, that is,

    Identity;

is replaced with

    Identity;;

then, even though a parse error occurs on line 6, tokenize_and_parse returns success. The output is

------------------------------------------------
Parsing :
BEGIN section
gaussian pmin = 0.0 pmax=2.17 qmin = 0.0 qmax = 1.0;
gaussian pmin = 0.0 pmax=2.17 qmin = 0.0 qmax = 1.0;
END section
BEGIN section
Identity;;
END section

rule_dbg: delimiter
rule_dbg: declaration
rule_dbg: declaration
rule_dbg: delimiter
rule_dbg: delimiter
rule_dbg: declaration
lex::tokenize_and_parse succeeds.
(first == last) = 0
Parser failed at:
END section

[tokenize_and_parse_test.cc]

//-----------------------------------------------------------------
// tokenize_and_parse_test.cc
// Demonstrates conventional lexing/parsing using boost.spirit 2.1
// ostiguy@...
//-----------------------------------------------------------------
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_statement.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_bind.hpp>
#include <iostream>
#include <string>

namespace lex = boost::spirit::lex;

template <typename Lexer>
struct my_lexer : boost::spirit::lex::lexer<Lexer>
{
    my_lexer()
    {
        delimiter  = "BEGIN|END";
        identifier = "[a-zA-Z][_\\.a-zA-Z0-9]*";
        ws         = "[ \\t\\n]+";
        real       = "([0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?)|([-+]?[1-9]+\\.?([eE][-+]?[0-9]+))";
        integer    = "[0-9]+";

        // whitespace is matched by the lexer but never passed to the parser
        this->self += ws[lex::_pass = lex::pass_flags::pass_ignore];
        this->self += delimiter;
        this->self += identifier;
        this->self += real;
        this->self += integer;
        this->self += '=';
        this->self += ';';
    }

    lex::token_def<>            ws;
    lex::token_def<std::string> identifier;
    lex::token_def<int>         integer;
    lex::token_def<double>      real;
    lex::token_def<double>      delimiter;
};

// Trace helper: prints the name of each rule that matched.
void rule_dbg(std::string const& str)
{
    std::cout << "rule_dbg: " << str << std::endl;
}

template <typename Iterator>
struct my_grammar : boost::spirit::qi::grammar<Iterator>
{
    template <typename TokenDef>
    my_grammar(TokenDef const& tok)
      : my_grammar::base_type(statement)
    {
        using namespace boost::spirit::qi;
        using boost::spirit::_1;
        using boost::phoenix::bind;

        // note: the top-level alternative ends in a Kleene star
        statement = eoi [bind(rule_dbg, "eoi")]
                  | *( delimiter   [bind(rule_dbg, "delimiter")]
                     | declaration [bind(rule_dbg, "declaration")]
                     )
                  ;

        delimiter   = tok.delimiter >> tok.identifier;
        declaration = tok.identifier >> option >> ';';
        option      = *(tok.identifier >> '=' >> (tok.real | tok.integer));
    }

    boost::spirit::qi::rule<Iterator> statement, delimiter, declaration, option;
};

typedef lex::lexertl::token<char const*>      token_type;
typedef lex::lexertl::actor_lexer<token_type> lexer_type;
typedef my_lexer<lexer_type>::iterator_type   iterator_type;

using namespace std;

int main(int argc, char* argv[])
{
    string test_string = "BEGIN section\n";
    test_string += "gaussian pmin = 0.0 pmax=2.17 qmin = 0.0 qmax = 1.0;\n";
    test_string += "gaussian pmin = 0.0 pmax=2.17 qmin = 0.0 qmax = 1.0;\n";
    test_string += "END section\n";
    test_string += "BEGIN section\n";

    // WE INTRODUCE A SYNTAX ERROR: ";;" instead of ";" as a terminator.
    test_string += "Identity;;\n";    // THIS WILL MAKE THE PARSER FAIL
    //test_string += "Identity;\n";   // CORRECT SYNTAX
    test_string += "END section\n";

    cout << "Parsing : \n" << test_string << endl;

    char const* first = &test_string[0];
    char const* last  = &first[test_string.size()];

    my_lexer<lexer_type>      lexer;
    my_grammar<iterator_type> grammar(lexer);

    bool ok = lex::tokenize_and_parse(first, last, lexer, grammar);

    if (ok) {
        cout << "lex::tokenize_and_parse succeeds." << endl;
    }
    else {
        cout << "lex::tokenize_and_parse fails." << endl;
    }

    cout << "(first == last) = " << (first == last) << endl;

    if (first != last) {
        string rest(first, last);
        cout << "Parser failed at: " << rest << endl;
    }
}
|
|
Re: a bug in tokenize_and_parse ?
> > From looking at the implementation of tokenize_and_parse I can't spot
> > any problems. Could you provide us with a small example reproducing this
> > behavior? I would consider it a bug if it behaved the way you're
> > describing.
>
> Ok. I attach a simple test.
> The parser parses a trivial toy language.

Thanks! That example made it clear. The tokenize_and_parse functions now check whether the lexer has reached its end of input. This makes the return value semantically equivalent to the return value of tokenize().

I hope you don't mind me adding a new regression test based on your example. I'm not sure if we will be able to include this fix into the upcoming release, though. But I'll ask Beman after the beta has been finished.

Some unrelated comment:

If you make your token_defs carry an explicit token value (i.e. token_def<double>), I suggest adding the full list of used token value types to the token definition as well:

    typedef lex::lexertl::token<char const*
      , mpl::vector<std::string, double, int> > token_type;

which enables late value conversion in the token type, making the whole lexing process more efficient.

Regards Hartmut
|
|
[lex] Is there a lex equivalent of distinct directive in Qi
Hi,

By the way I am extremely impressed with the whole Spirit 2.1 frameworks and docs. An awesome piece of work.

I want to write a lexer that uses the equivalent of the distinct directive that's now in the repository. What would you recommend as the best technique in Lex for doing this?

Andy
|
|
Re: a bug in tokenize_and_parse ?
Hartmut Kaiser wrote:

> I hope you don't mind me adding a new regression test based on your
> example. I'm not sure if we will be able to include this fix into the
> upcoming release, though. But I'll ask Beman after the beta has been
> finished.

I am more than happy to make (a very very minor) contribution to such a fine piece of work. Use my test case as you see fit.

I sure hope that 1.41 will include the fix. Perhaps you should consider maintaining a list of "interesting" bugfixes on the spirit website (the list would provide a short explanation and refer to relevant svn commits so that patches can be retrieved).

Another issue: standalone functions such as lex::tokenize_and_parse() and lex::tokenize_and_phrase_parse() do not seem to be formally documented, although lex::tokenize_and_parse is mentioned in one of the examples. There is also no discussion of why they should be used instead of, say, the parse() member functions.

> Some unrelated comment:
>
> If you make your token_defs carry an explicit token value (i.e.
> token_def<double>), I suggest adding the full list of used token value
> types to the token definition as well:
>
>     typedef lex::lexertl::token<char const*
>       , mpl::vector<std::string, double, int> > token_type;
>
> which enables late value conversion in the token type, making the whole
> lexing process more efficient.

Thanks for this comment. I am barely beginning to be able to appreciate this kind of detail. I really admire the quality of the work that went into spirit and the dedication of its developers.

Regards

-Francois
|
|
Re: a bug in tokenize_and_parse ?
> > I hope you don't mind me adding a new regression test based on your
> > example. I'm not sure if we will be able to include this fix into the
> > upcoming release, though. But I'll ask Beman after the beta has been
> > finished.
>
> I am more than happy to make (a very very minor) contribution to such
> a fine piece of work. Use my test case as you see fit.

Ok, it's added to SVN now.

> I sure hope that 1.41 will include the fix. Perhaps you should
> consider maintaining a list of "interesting" bugfixes
> on the spirit website (the list would provide a short explanation and
> refer to relevant svn commits so that patches can be retrieved).

Good point, we might want to do that during the time between releases.

> Another issue: standalone functions such as lex::tokenize_and_parse()
> and lex::tokenize_and_phrase_parse() do not seem to be formally
> documented, although lex::tokenize_and_parse is mentioned in one of the
> examples. There is also no discussion of why they should be used
> instead of, say, the parse() member functions.

Yes, the lexer docs are incomplete, sorry. I'm working on that, still. We concentrated on Qi/Karma docs for this release.

> > Some unrelated comment:
> >
> > If you make your token_defs carry an explicit token value (i.e.
> > token_def<double>), I suggest adding the full list of used token value
> > types to the token definition as well:
> >
> >     typedef lex::lexertl::token<char const*
> >       , mpl::vector<std::string, double, int> > token_type;
> >
> > which enables late value conversion in the token type, making the whole
> > lexing process more efficient.
>
> Thanks for this comment. I am barely beginning to be able to appreciate
> this kind of detail. I really admire the quality of the work that went
> into spirit and the dedication of its developers.

Thanks!

Regards Hartmut
|
|
Re: a bug in tokenize_and_parse ?
> Another issue: standalone functions such as lex::tokenize_and_parse()
> and lex::tokenize_and_phrase_parse() do not seem to be formally
> documented, although lex::tokenize_and_parse is mentioned in one of the
> examples. There is also no discussion of why they should be used
> instead of, say, the parse() member functions.

Added now here: http://tinyurl.com/yfw8oqq.

Regards Hartmut
|
|
Re: a bug in tokenize_and_parse ?
Francois,

> > I hope you don't mind me adding a new regression test based on your
> > example. I'm not sure if we will be able to include this fix into the
> > upcoming release, though. But I'll ask Beman after the beta has been
> > finished.
>
> I am more than happy to make (a very very minor) contribution to such
> a fine piece of work. Use my test case as you see fit.
>
> I sure hope that 1.41 will include the fix. Perhaps you should
> consider maintaining a list of "interesting" bugfixes
> on the spirit website (the list would provide a short explanation and
> refer to relevant svn commits so that patches can be retrieved).

I have been thinking about this 'fix' ever since and came to the conclusion that I would like to revert that change.

Here is my rationale: all Qi parse API functions return whether the parsing succeeded without checking whether the end of input (eoi) has been reached. That allows parsing of partial input while still getting the proper return value.

The change I made to the lexer API functions (tokenize_and_parse, tokenize_and_phrase_parse) introduces different semantics because these functions now check for the eoi criterion as well. But I would like to keep the semantics of the two APIs as close as possible.

Reverting this change would require a minor change to your code, as you now need to check for the eoi criterion yourself by comparing the iterators after the tokenize_and_... functions have returned:

    // old code
    bool ok = lex::tokenize_and_parse(first, last, lexer, grammar);

    // new code
    bool ok = lex::tokenize_and_parse(first, last, lexer, grammar) &&
              first == last;

How does this sound to you?

Regards Hartmut
|
|
Re: a bug in tokenize_and_parse ?
Hartmut Kaiser wrote:

> I was thinking about this 'fix' ever since and came to the conclusion that
> I would like to revert that change.
>
> Here is my rationale: all Qi parse API functions return whether the
> parsing succeeded without checking whether the end of input (eoi) has been
> reached. That allows parsing of partial input while still getting the
> proper return value.
>
> The change I made to the lexer API functions (tokenize_and_parse,
> tokenize_and_phrase_parse) introduces different semantics because these
> functions now check for the eoi criteria as well. But I would like to keep
> the semantics as close as possible.
>
> Reverting this change would require a minor change to your code as you now
> need to check for the eoi criteria yourself by comparing the iterators
> after the tokenize_and_... functions returned:
>
>     // old code
>     bool ok = lex::tokenize_and_parse(first, last, lexer, grammar);
>
>     // new code
>     bool ok = lex::tokenize_and_parse(first, last, lexer, grammar) &&
>               first == last;

I think this is fine as long as it is documented. The issue is that the status code true does not imply success, and this needs to be clear. Some of the examples also need to be modified because they imply that checking the status code is sufficient, which it is not.

-Francois
|
|
Re: a bug in tokenize_and_parse ?
> > I was thinking about this 'fix' ever since and came to the conclusion
> > that I would like to revert that change.
> >
> > Here is my rationale: all Qi parse API functions return whether the
> > parsing succeeded without checking whether the end of input (eoi) has
> > been reached. That allows parsing of partial input while still getting
> > the proper return value.
> >
> > The change I made to the lexer API functions (tokenize_and_parse,
> > tokenize_and_phrase_parse) introduces different semantics because
> > these functions now check for the eoi criteria as well. But I would
> > like to keep the semantics as close as possible.
> >
> > Reverting this change would require a minor change to your code as
> > you now need to check for the eoi criteria yourself by comparing the
> > iterators after the tokenize_and_... functions returned:
> >
> >     // old code
> >     bool ok = lex::tokenize_and_parse(first, last, lexer, grammar);
> >
> >     // new code
> >     bool ok = lex::tokenize_and_parse(first, last, lexer, grammar) &&
> >               first == last;
>
> I think this is fine as long as it is documented. The issue is that the
> status code true does not imply success, and this needs to be clear.
> Some of the examples also need to be modified because they imply that
> checking the status code is sufficient, which it is not.

Makes sense.

Regards Hartmut
|
|
Re: a bug in tokenize_and_parse ?
On Tue, Nov 3, 2009 at 11:28 AM, Jean-Francois Ostiguy <ostiguy@...> wrote:

> I think this is fine as long as it is documented. The issue is that the
> status code true does not imply success, and this needs to be clear. Some of
> the examples also need to be modified because they imply that checking
> the status code is sufficient, which it is not.

But it does indicate success. It does not mean that all of your input was parsed, but the parse did complete successfully. Thus I do not understand why you say it does not?
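The distinction drawn here (a failed parse, a successful parse of only part of the input, and a successful parse of all of it) can be made explicit in code. The following is only a minimal sketch and does not use Spirit: `parse_digits` is a hypothetical stand-in for any parse function that, like the Qi API functions discussed in this thread, advances `first` and reports success without checking for end of input, and `Outcome` and `classify` are names invented for this illustration.

```cpp
#include <cctype>
#include <string>

// Three-way result: the bool returned by the parse call alone cannot
// tell Partial and Complete apart.
enum class Outcome { Failed, Partial, Complete };

// Hypothetical stand-in for a parse function with Qi-like semantics:
// matches one or more digits, advances `first` past them, and ignores
// whatever follows. It does NOT check that `first` reached `last`.
bool parse_digits(char const*& first, char const* last)
{
    char const* start = first;
    while (first != last && std::isdigit(static_cast<unsigned char>(*first)))
        ++first;
    return first != start;
}

// Combines the return value with the iterator position after the call.
Outcome classify(std::string const& input)
{
    char const* first = input.data();
    char const* last  = first + input.size();

    if (!parse_digits(first, last))
        return Outcome::Failed;      // the parser itself reported failure

    return first == last ? Outcome::Complete   // everything was consumed
                         : Outcome::Partial;   // success, but input remains
}
```

Under these assumptions, `classify("123abc")` reports a partial match: the parse function returned true, yet input is left over, which is the situation the thread is arguing about.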
|
|
Re: a bug in tokenize_and_parse ?
OvermindDL1 wrote:

> But it does indicate success. It does not mean that all of your input
> was parsed, but the parse did complete successfully. Thus I do not
> understand why you say it does not?

Maybe this is not clear.

Say you call

    bool success = tokenize_and_parse(start, end, lexer, parser);

Assume you are parsing a file and there is a syntax error somewhere in the input; the current semantics is to return a status of "true" (success) even if parsing has stopped due to incorrect syntax.

You say: if not all input was parsed, this is "success". So how do you define "failure", i.e. what does success = false mean? As I understand it, it means that lexing has failed.

My definition of success is: both lexing and parsing are completely successful. This is what I want to test for in my code before proceeding to other tasks.

For this, I need to do something like

    bool success =
        tokenize_and_parse(start, end, lexer, parser) && (start == end);

or more likely

    bool success = tokenize_and_parse(start, end, lexer, parser);
    success = (success && (start == end));

since I think that with the first form one cannot assume that tokenize_and_parse( ... ) would be evaluated first.

The bottom line is that it is not worth making a Federal case of the specific semantics. I do think, however, that it needs to be clearly documented.

-Francois
|
|
Re: a bug in tokenize_and_parse ?
> > But it does indicate success. It does not mean that all of your
> > input was parsed, but the parse did complete successfully. Thus I do
> > not understand why you say it does not?
>
> Maybe this is not clear.
>
> Say you call
>
>     bool success = tokenize_and_parse(start, end, lexer, parser);
>
> Assume you are parsing a file and there is a syntax error somewhere in
> the input; the current semantics is to return a status of "true"
> (success) even if parsing has stopped due to incorrect syntax.

That's not true. Your parser returned true even in case of an error because your top level rule was a Kleene expression (unary operator*()) which by design always succeeds, even if matching nothing. Its semantics are 'match zero or more items of something', so any number of successfully matched items is a success. You didn't write the grammar so that it requires matching the whole input.

> You say: if not all input was parsed, this is "success".
> So how do you define "failure", i.e. what does success = false mean?
> As I understand it, it means that lexing has failed.

Success means the parser returned success. That might happen even if the input has been matched only partially. If you want to ensure your parser ate all the input you need to either append a '>> qi::eoi' to your grammar or check whether the iterators are equal after parsing.

> My definition of success is: both lexing and parsing are completely
> successful.

Sure, I agree. But if your parser thinks everything is ok, then tokenize_and_parse can't tell differently.

> This is what I want to test for in my code before proceeding to
> other tasks.
>
> For this, I need to do something like
>
>     bool success =
>         tokenize_and_parse(start, end, lexer, parser) && (start == end);
>
> or more likely
>
>     bool success = tokenize_and_parse(start, end, lexer, parser);
>     success = (success && (start == end));
>
> since I think that with the first form one cannot assume that
> tokenize_and_parse( ... ) would be evaluated first.

The Standard guarantees the first form to be correct, always.

> The bottom line is that it is not worth making a Federal case of the
> specific semantics. I do think, however, that it needs to be clearly
> documented.

Sure, agreed as well.

Regards Hartmut
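The Kleene-star behaviour described in this message can be reproduced without Spirit. The sketch below is a toy model, not Spirit's implementation: `Tokens`, `Parser`, `lit`, `seq`, `star` and `eoi` are all names made up for this illustration, and tokens are plain strings. It shows that a zero-or-more parser reports success on a token stream containing a syntax error, while the same parser followed by an end-of-input check does not.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-ins for a token stream and Qi-style parsers (not Spirit).
using Tokens = std::vector<std::string>;

// A parser looks at the tokens starting at `pos`, advances `pos` on a
// match, and returns whether it matched.
using Parser = std::function<bool(Tokens const&, std::size_t&)>;

// Matches exactly one token equal to `expected`.
Parser lit(std::string expected)
{
    return [expected](Tokens const& in, std::size_t& pos) {
        if (pos < in.size() && in[pos] == expected) { ++pos; return true; }
        return false;
    };
}

// Sequence: `a` followed by `b`; restores `pos` if either part fails.
Parser seq(Parser a, Parser b)
{
    return [a, b](Tokens const& in, std::size_t& pos) {
        std::size_t saved = pos;
        if (a(in, pos) && b(in, pos)) return true;
        pos = saved;
        return false;
    };
}

// Kleene star: zero or more repetitions of `p`. It stops at the first
// failure (or when `p` stops consuming) and then ALWAYS reports success.
Parser star(Parser p)
{
    return [p](Tokens const& in, std::size_t& pos) {
        for (;;) {
            std::size_t before = pos;
            if (!p(in, pos) || pos == before) break;
        }
        return true;
    };
}

// End of input: succeeds only if no tokens are left.
Parser eoi()
{
    return [](Tokens const& in, std::size_t& pos) {
        return pos == in.size();
    };
}
```

With `declaration = seq(lit("id"), lit(";"))` and the tokens `id ; ; END`, `star(declaration)` returns true with `pos` left at 2, whereas `seq(star(declaration), eoi())` returns false. Note that a one-or-more repetition would only demand at least one match; it is the end-of-input check that forces the whole input to be consumed.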
|
|
Re: a bug in tokenize_and_parse ?
Hartmut Kaiser wrote:

> That's not true. Your parser returned true even in case of an error
> because your top level rule was a Kleene expression (unary operator*())
> which by design always succeeds, even if matching nothing. Its semantics
> are 'match zero or more items of something', so any number of successfully
> matched items is a success. You didn't write the grammar requiring to
> match the whole input.

Interesting; I certainly missed that nuance. I suppose one would need something like

    rule = eoi | +( statement );

>> bool success = tokenize_and_parse(start, end, lexer, parser);
>> success = (success && (start == end));
>>
>> since I think that with the first form one cannot assume that
>> tokenize_and_parse( ... ) would be evaluated first.
>
> The Standard guarantees the first form to be correct, always.

I was not sure about that ... but it is reassuring that the Standard guarantees the order of evaluation in a logical expression ;-)

>> The bottom line is that it is not worth making a Federal case of the
>> specific semantics. I do think, however, that it needs to be clearly
>> documented.
>
> Sure, agreed as well.

Thank you for the explanation. At this point, I think your decision to leave things the way they were in the first place is definitely the correct one.

-Francois

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
trial. Simplify your report design, integration and deployment - and focus on
what you do best, core application coding. Discover what's new with
Crystal Reports now. http://p.sf.net/sfu/bobj-july
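On the evaluation-order point: for operands of type bool, the built-in `&&` evaluates its left operand first, including all of its side effects, and evaluates the right operand only if the left one yielded true. The one-expression and two-statement forms discussed above are therefore equivalent. A minimal sketch follows, in which `consume_as` is a hypothetical stand-in for tokenize_and_parse (it advances `first` by reference and, like a top-level Kleene star, always reports success); the other function names are likewise invented for this illustration.

```cpp
#include <string>

// Hypothetical stand-in for tokenize_and_parse: consumes leading 'a'
// characters, advances `first` past them, and always returns true.
bool consume_as(char const*& first, char const* last)
{
    while (first != last && *first == 'a')
        ++first;
    return true;
}

// One-expression form: the call on the left of && has finished, and
// `first` has already been updated, before `first == last` is evaluated.
bool one_expression(std::string const& input)
{
    char const* first = input.data();
    char const* last  = first + input.size();
    return consume_as(first, last) && first == last;
}

// Two-statement form: yields the same result as the form above.
bool two_statements(std::string const& input)
{
    char const* first = input.data();
    char const* last  = first + input.size();
    bool success = consume_as(first, last);
    success = success && first == last;
    return success;
}
```

Both functions return true for "aaa" and false for "aab". The ordering guarantee applies to the built-in operator; a user-defined `operator&&` would not short-circuit, but that does not arise when both operands are plain bool expressions as here.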