|
View:
New views
13 Messages
—
Rating Filter:
Alert me
|
|
|
Boost.Lex and Boost.Qi reasonsI am curious, I have yet to see something in Boost.Lex that Boost.Qi
could not do and execute faster with directly. Why use Boost.Lex to parse into a token stream then parse that into your data, why not just do it all at once, and much faster to boot, by just using Boost.Qi itself only? I just do not see the reasons to Lex something first, so I am just trying to understand. ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsOvermindDL1 wrote:
> I am curious, I have yet to see something in Boost.Lex that Boost.Qi > could not do and execute faster with directly. Why use Boost.Lex to > parse into a token stream then parse that into your data, why not just > do it all at once, and much faster to boot, by just using Boost.Qi > itself only? I just do not see the reasons to Lex something first, so > I am just trying to understand. In theory, the deterministic nature of Lex (using a DFA) makes it faster than Qi, which is a backtracking recursive descent parser with potentially quadratic worst-case performance (esp. with poorly written code). This is especially true at the token level. In practice... well... your mileage may vary. I really haven't benchmarked specific applications. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net http://www.facebook.com/djowel Meet me at BoostCon http://www.boostcon.com/home http://www.facebook.com/boostcon ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsOn Sat, Oct 31, 2009 at 9:42 PM, Joel de Guzman
<joel@...> wrote: > OvermindDL1 wrote: >> I am curious, I have yet to see something in Boost.Lex that Boost.Qi >> could not do and execute faster with directly. Why use Boost.Lex to >> parse into a token stream then parse that into your data, why not just >> do it all at once, and much faster to boot, by just using Boost.Qi >> itself only? I just do not see the reasons to Lex something first, so >> I am just trying to understand. > > In theory, the deterministic nature of Lex (using a DFA) makes it > faster than Qi, which is a backtracking recursive descent parser > with potentially quadratic worst-case performance (esp. with > poorly written code). This is especially true at the token level. > > In practice... well... your mileage may vary. I really haven't > benchmarked specific applications. Hmm, I just instinctively thought it would be slower due to its use of regex (not statically compiled) and the fact it is another pass you do before running QI anyway to get your final parsed data. I am curious, we should do benchmarks... ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsOvermindDL1 wrote:
> On Sat, Oct 31, 2009 at 9:42 PM, Joel de Guzman > <joel@...> wrote: >> OvermindDL1 wrote: >>> I am curious, I have yet to see something in Boost.Lex that Boost.Qi >>> could not do and execute faster with directly. Why use Boost.Lex to >>> parse into a token stream then parse that into your data, why not just >>> do it all at once, and much faster to boot, by just using Boost.Qi >>> itself only? I just do not see the reasons to Lex something first, so >>> I am just trying to understand. >> In theory, the deterministic nature of Lex (using a DFA) makes it >> faster than Qi, which is a backtracking recursive descent parser >> with potentially quadratic worst-case performance (esp. with >> poorly written code). This is especially true at the token level. >> >> In practice... well... your mileage may vary. I really haven't >> benchmarked specific applications. > > Hmm, I just instinctively thought it would be slower due to its use of > regex (not statically compiled) and the fact it is another pass you do > before running QI anyway to get your final parsed data. I am curious, > we should do benchmarks... You have a point. While the big-O complexity is, in theory, better than RD, the hidden constants may be magnitudes apart. With RD, we have the optimizer in our favor. With anything that's "interpreted", such as DFAs, the limit of optimization goes as far as the implementation. It makes me really curious now as to what the performance is vs. YACC, for example. It blew me away when I saw benchmarks against ANTLR/PCCTS, for example. LL(1) being another table based interpreter. I recall at one point in the comp.compilers NG that some say LL1 is faster than LALR/LR albeit with a lesser capable grammar. With pure lexing though, I still think a DFA will beat Qi. Ben Hanson had some benchmarks against "classic" spirit showing Lex's edge. I wonder how Qi compares now. I also wonder how Lex+Qi will fare vs. straight Qi. What I haven't mentioned though is that, the fact that they are interpreted means you can stop the execution at any time. That means that you can have both push and pull parsing. Not so with RD which is just straight code using real stacks (aside from using continuations or threads, or some such). Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net http://www.facebook.com/djowel Meet me at BoostCon http://www.boostcon.com/home http://www.facebook.com/boostcon ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasons> I am curious, I have yet to see something in Boost.Lex that Boost.Qi
> could not do and execute faster with directly. Why use Boost.Lex to > parse into a token stream then parse that into your data, why not just > do it all at once, and much faster to boot, by just using Boost.Qi > itself only? I just do not see the reasons to Lex something first, so > I am just trying to understand. A lexer will be faster than a pure lexerless solution for grammars requiring heavy backtracking. There lexing gives you 2 advantages a) it avoids rescanning the input, this is done exactly once b) it avoids re-converting input into token values, this is done once as well This is achieved by two means: the multi_pass iterator adaptor and some clever lazy token value conversion which is executed only when first needed. Regards Hartmut ------------------- Meet me at BoostCon http://boostcon.com ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsOn 01.11.2009 03:51, OvermindDL1 wrote:
> I am curious, I have yet to see something in Boost.Lex that Boost.Qi > could not do and execute faster with directly. Why use Boost.Lex to > parse into a token stream then parse that into your data, why not just > do it all at once, and much faster to boot, by just using Boost.Qi > itself only? I just do not see the reasons to Lex something first, so > I am just trying to understand. > > ------------------------------------------------------------------------------ > Come build with us! The BlackBerry(R) Developer Conference in SF, CA > is the only developer event you need to attend this year. Jumpstart your > developing skills, take BlackBerry mobile applications to market and stay > ahead of the curve. Join us from November 9 - 12, 2009. Register now! > http://p.sf.net/sfu/devconference > _______________________________________________ > Spirit-general mailing list > Spirit-general@... > https://lists.sourceforge.net/lists/listinfo/spirit-general > Aside from performance reasons, another use case may be to parse something in two stages. In my company we use XML documents in a lot of applications. Every developer has his own solution mostly build on top of DOM trees. So we have decided to unify the access to those files. This has been done with Lex and Qi. I have written a lexer which adapts an arbitrary XML parser. It generates tokens for start, end tags, values etc. As within a normal lexer, regular expressions can be provided, to differentiate tags. These tokens can then be parsed with a Qi grammar. This gives us a formally defined document structure which can be easily read and changed. No more hard to track recursion etc. And with a good XML parser this is magnitudes faster than building and traversing a DOM tree. :-) Regards Henning ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsIn article <4aed87e6.8802be0a.4eaa.2e75@...>, "Hartmut Kaiser" <hartmut.kaiser@...> writes: > This is achieved by two means: the multi_pass iterator adaptor and some > clever lazy token value conversion which is executed only when first needed. Is there an example of this in the docs? -- "The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download <http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/> Legalize Adulthood! <http://legalizeadulthood.wordpress.com> ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsOn Sun, Nov 1, 2009 at 9:29 AM, Richard <legalize@...> wrote:
> > In article <4aed87e6.8802be0a.4eaa.2e75@...>, > "Hartmut Kaiser" <hartmut.kaiser@...> writes: > >> This is achieved by two means: the multi_pass iterator adaptor and some >> clever lazy token value conversion which is executed only when first needed. > > Is there an example of this in the docs? Yea, I would prefer a good example of why you would want to use Lex as well. Even given the example from Henning, I still see a way that could be done in pure QI, and the code would be shorter, and I do not see how it could not execute faster either... Is there an actual example of full text to data structure usage that proves that using Lex first is faster then QI in some real-world case that I can benchmark? ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasons> > In article <4aed87e6.8802be0a.4eaa.2e75@...>,
> > "Hartmut Kaiser" <hartmut.kaiser@...> writes: > > > >> This is achieved by two means: the multi_pass iterator adaptor and > some > >> clever lazy token value conversion which is executed only when first > needed. > > > > Is there an example of this in the docs? > > Yea, I would prefer a good example of why you would want to use Lex as > well. Even given the example from Henning, I still see a way that > could be done in pure QI, and the code would be shorter, and I do not > see how it could not execute faster either... > > Is there an actual example of full text to data structure usage that > proves that using Lex first is faster then QI in some real-world case > that I can benchmark? I don't have such an example handy. But I guess lexing C++ tokens as a pre-step for a C++ parser is a good example for this. C++ is very difficult to parse and because of its ambiguities can result in deep backtracking (just think of the nice template<A > B> construct...) Regards Hartmut ------------------- Meet me at BoostCon http://boostcon.com ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsOn Sun, Nov 1, 2009 at 7:54 PM, Hartmut Kaiser <hartmut.kaiser@...> wrote:
> But I guess lexing C++ tokens as a pre-step for a C++ parser is a good example for this. C++ is very difficult to parse and because of its ambiguities can result in deep backtracking (just think of the nice template<A > B> construct...) [slightly offtopic] Hence one reason I like the fact that D changed templates from <> to !() instead, unambiguous, it removes all ambiguities in C++ syntax. :) ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsHartmut Kaiser wrote:
>>> In article <4aed87e6.8802be0a.4eaa.2e75@...>, >>> "Hartmut Kaiser" <hartmut.kaiser@...> writes: >>> >>>> This is achieved by two means: the multi_pass iterator adaptor and >> some >>>> clever lazy token value conversion which is executed only when first >> needed. >>> Is there an example of this in the docs? >> Yea, I would prefer a good example of why you would want to use Lex as >> well. Even given the example from Henning, I still see a way that >> could be done in pure QI, and the code would be shorter, and I do not >> see how it could not execute faster either... >> >> Is there an actual example of full text to data structure usage that >> proves that using Lex first is faster then QI in some real-world case >> that I can benchmark? > > I don't have such an example handy. > > But I guess lexing C++ tokens as a pre-step for a C++ parser is a good example for this. C++ is very difficult to parse and because of its ambiguities can result in deep backtracking (just think of the nice template<A > B> construct...) Also, re: performance, Ben Hanson told me that he can write a variant of lexertl that uses tight c++ switch like RE2C. That will definitely be smokin hot! Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net http://www.facebook.com/djowel Meet me at BoostCon http://www.boostcon.com/home http://www.facebook.com/boostcon ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsOn Nov 1, 2009, at 10:46 PM, Joel de Guzman wrote: te<A > B> construct...) > > Also, re: performance, Ben Hanson told me that he can write a variant > of lexertl that uses tight c++ switch like RE2C. That will definitely > be smokin hot! Interesting... As for examples any real programming language comes to mind. Also if your lex-less parser is jumping thru loops to get around qi's op * ()'s greediness, lots of things that are easier written like *( a | b) >> b. [example odd binary numbers of unspecified size. "[01]*1" will succeed but *(char_("01") >> char_('1') ; will not. it can be done with a local bool like *(char_("01")[_a = (_1 == '1')) [_pass = _a] but this can get messy. I'd say also that if the lexing is 'hairy' consider lexing. Write a lexer and test it when it produces proper tokens, make it a static lexer. Make / make-like programs could automatically revert to a dynamic lexer if the lexer code is changed. But not back to static without intervention. Also a lot of keywords might be better in lex as it is probably O (n) instead of O(nlogn) which symbols is close to. but qi is fine for small problems without lex. 'real' programming languages probably not. as for the 'random access iterator' problem of lex fix it. It would be nice if it could easily be decoupled from the source iterators so the source code does not need to stick around for qi. the a lexer could flush its buffers an any pass_normal,pass_ignore, since lex is essentially linear, once the match is found it does not backtrack. [At least I can't see why it would] ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
|
|
Re: Boost.Lex and Boost.Qi reasonsCarl Barron wrote:
> Also a lot of keywords might be better in lex as it is probably O > (n) instead of O(nlogn) which symbols is close to. Again, this has to be tested. This is my basic assumption too, but you can never really tell. The hidden K is unseen. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net http://www.facebook.com/djowel Meet me at BoostCon http://www.boostcon.com/home http://www.facebook.com/boostcon ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ Spirit-general mailing list Spirit-general@... https://lists.sourceforge.net/lists/listinfo/spirit-general |
| Free embeddable forum powered by Nabble | Forum Help |