SableCC 4 and old grammar code base

Hi,

The one problem I found with SableCC, the only one in fact, was the small base of available grammars. Finding a JavaScript grammar was really hard compared to other lexer/parser/walker generators.

If SableCC 4 changes the grammar syntax, what is the future of that already small grammar base?
Will SableCC 4 be backward compatible with old grammars?
Will a converter (done with SableCC, of course) from the old grammar syntax to the new one be available?
Any other ideas?

Thanks!

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion
|
|
Re: SableCC 4 and old grammar code base

The SableCC 4 syntax is quite close to the previous syntax. I needed to change the syntax a little, to fix problems in the old one and to allow for new features.

As an illustration, I just converted, in less than 5 minutes, the SableCC 3 MiniBasic grammar to SableCC 4 syntax. I have attached both grammars so that you can look at the differences.

In the attached SableCC 4 version:

1- Both Helpers and Tokens sections are merged into a single Lexer section. Ignored is a subsection of Lexer.
2- Numeric character literals use a # (e.g. #10).
3- No more sets. [[32..127] - [cr + lf]] becomes (#32..#127) - (cr | lf).
4- The Productions section is now called Parser.
5- Alternative and element names are written as {alt_name:} and [elem_name:].

As you see, a direct conversion should be simple enough. It could probably be automated, except for grammars that use lexer states.

But the resulting SableCC 4 grammar is not as elegant as SableCC 4 allows. I'll send another message with a more elegant MiniBasic grammar, using the new syntax.

Etienne

Jean-Baptiste BRIAUD -- Novlog wrote:
> One problem I found with SableCC, the only one in fact, was the poor
> grammar base available.
> Finding a javascript grammar was really hard compared to other
> lexer/parser/walker generator.
>
> If SableCC 4 change the grammar syntax, what is the future for that
> already small grammar base ?
> Will SableCC 4 be retrocompatible with old grammar ?
> Will a converter (of course done with SableCC) from old grammar to new
> grammar be available ?
> Other idea ?

Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org

SableCC 3 grammar (attached):

    Package org.sablecc.minibasic;

    Helpers
      letter = ['A'..'Z'];
      digit = ['0'..'9'];
      cr = 13;
      lf = 10;
      not_cr_lf = [[32..127] - [cr + lf]];

    Tokens
      if = 'IF';
      then = 'THEN';
      else = 'ELSE';
      endif = 'ENDIF';
      for = 'FOR';
      to = 'TO';
      next = 'NEXT';
      read = 'READ';
      print = 'PRINT';
      println = 'PRINTLN';
      assign = ':=';
      less_than = '<';
      greater_than = '>';
      equal = '=';
      plus = '+';
      minus = '-';
      mult = '*';
      div = '/';
      mod = 'MOD';
      l_par = '(';
      r_par = ')';
      identifier = letter (letter | digit)*;
      number = digit+;
      string = '"' [not_cr_lf - '"']* '"';
      new_line = cr | lf | cr lf;
      blank = ' '*;

    Ignored Tokens
      blank;

    Productions
      statements =
        {list} statement statements |
        {empty} ;

      statement =
        {if} if condition then [nl1]:new_line statements optional_else endif [nl2]:new_line |
        {for} for identifier assign [from_exp]:expression to [to_exp]:expression [nl1]:new_line statements next [nl2]:new_line |
        {read} read identifier new_line |
        {print_exp} print expression new_line |
        {print_str} print string new_line |
        {println} println new_line |
        {assignment} identifier assign expression new_line;

      optional_else =
        {else} else new_line statements |
        {empty} ;

      condition =
        {less_than} [left]:expression less_than [right]:expression |
        {greater_than} [left]:expression greater_than [right]:expression |
        {equal} [left]:expression equal [right]:expression;

      expression =
        {value} value |
        {plus} [left]:value plus [right]:value |
        {minus} [left]:value minus [right]:value |
        {mult} [left]:value mult [right]:value |
        {div} [left]:value div [right]:value |
        {mod} [left]:value mod [right]:value;

      value =
        {constant} number |
        {identifier} identifier |
        {expression} l_par expression r_par;

SableCC 4 grammar (attached):

    Language minibasic;

    Lexer
      letter = 'A'..'Z';
      digit = '0'..'9';
      cr = #13;
      lf = #10;
      not_cr_lf = (#32..#127) - (cr | lf);

      if = 'IF';
      then = 'THEN';
      else = 'ELSE';
      endif = 'ENDIF';
      for = 'FOR';
      to = 'TO';
      next = 'NEXT';
      read = 'READ';
      print = 'PRINT';
      println = 'PRINTLN';
      assign = ':=';
      less_than = '<';
      greater_than = '>';
      equal = '=';
      plus = '+';
      minus = '-';
      mult = '*';
      div = '/';
      mod = 'MOD';
      l_par = '(';
      r_par = ')';
      identifier = letter (letter | digit)*;
      number = digit+;
      string = '"' (not_cr_lf - '"')* '"';
      new_line = cr | lf | cr lf;
      blank = ' '*;

      Ignored
        blank;

    Parser
      statements =
        {list:} statement statements |
        {empty:} ;

      statement =
        {if:} if condition then [nl1:]new_line statements optional_else endif [nl2:]new_line |
        {for:} for identifier assign [from_exp:]expression to [to_exp:]expression [nl1:]new_line statements next [nl2:]new_line |
        {read:} read identifier new_line |
        {print_exp:} print expression new_line |
        {print_str:} print string new_line |
        {println:} println new_line |
        {assignment:} identifier assign expression new_line;

      optional_else =
        {else:} else new_line statements |
        {empty:} ;

      condition =
        {less_than:} [left:]expression less_than [right:]expression |
        {greater_than:} [left:]expression greater_than [right:]expression |
        {equal:} [left:]expression equal [right:]expression;

      expression =
        {value:} value |
        {plus:} [left:]value plus [right:]value |
        {minus:} [left:]value minus [right:]value |
        {mult:} [left:]value mult [right:]value |
        {div:} [left:]value div [right:]value |
        {mod:} [left:]value mod [right:]value;

      value =
        {constant:} number |
        {identifier:} identifier |
        {expression:} l_par expression r_par;
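
As a rough illustration of how part of the direct conversion described in this message might be automated, here is a minimal Python sketch covering only rules 4 and 5 (renaming Productions to Parser, and moving the colon inside alternative and element names). The function name `convert_productions` and the regular expressions are assumptions made for this example, not part of SableCC. Rules 1 to 3, lexer states, and AST transformations are deliberately left out, and the sketch assumes the word Productions appears only as the section header.

```python
import re

# Hypothetical helper for illustration only; not part of SableCC.
# {alt_name} in SableCC 3 becomes {alt_name:} in SableCC 4.
ALT_NAME = re.compile(r"\{([a-z][a-z0-9_]*)\}")
# [elem_name]:element in SableCC 3 becomes [elem_name:]element in SableCC 4.
ELEM_NAME = re.compile(r"\[([a-z][a-z0-9_]*)\]:")


def convert_productions(sablecc3_grammar: str) -> str:
    """Apply rules 4 and 5 to everything from 'Productions' onward.

    Text before the Productions header (Helpers, Tokens, ...) is returned
    unchanged, since converting it needs rules 1 to 3.
    """
    match = re.search(r"\bProductions\b", sablecc3_grammar)
    if match is None:
        return sablecc3_grammar  # no parser section, nothing to rewrite
    head = sablecc3_grammar[:match.start()]
    body = sablecc3_grammar[match.end():]
    body = ALT_NAME.sub(r"{\1:}", body)    # rule 5, alternative names
    body = ELEM_NAME.sub(r"[\1:]", body)   # rule 5, element names
    return head + "Parser" + body          # rule 4


if __name__ == "__main__":
    old = (
        "Productions\n"
        "  optional_else =\n"
        "    {else} else new_line statements |\n"
        "    {empty} ;\n"
        "  condition =\n"
        "    {equal} [left]:expression equal [right]:expression;\n"
    )
    print(convert_productions(old))
```

Restricting the rewrite to the text after the Productions header keeps the regular expressions away from character sets such as [[32..127] - [cr + lf]] in the Helpers section, which need a different treatment.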
|
|
Re: SableCC 4 and old grammar code base

Here's a more elegant version of the MiniBasic grammar.

Note that I didn't change the language. I could have used new features to get more powerful expressions, for example. But, that was not the objective here. The objective was only to assure you that the new syntax is close enough to the old one as to make the transition very, very easy for current SableCC users and for converting old grammars, and that it is actually much better.

Have fun!

Etienne

--
Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org

    Language minibasic;

    Lexer
      letter = 'A'..'Z';
      digit = '0'..'9';
      cr = #13;
      lf = #10;
      not_cr_lf = (#32..#127) - (cr | lf);

      identifier = letter (letter | digit)*;
      number = digit+;
      string = '"' (not_cr_lf - '"')* '"';
      new_line = cr | lf | cr lf;
      blank = ' '+;

      Ignored
        blank;

    Parser
      statements =
        statement*;

      statement =
        {if:} 'IF' condition 'THEN' new_line statements else_part? 'ENDIF' new_line |
        {for:} 'FOR' identifier ':=' [from_exp:]exp 'TO' [to_exp:]exp new_line statements 'NEXT' new_line |
        {read:} 'READ' identifier new_line |
        {print_exp:} 'PRINT' exp new_line |
        {print_str:} 'PRINT' string new_line |
        {println:} 'PRINTLN' new_line |
        {assignment:} identifier ':=' exp new_line;

      else_part =
        'ELSE' new_line statements;

      condition =
        {less_than:} [left_exp:]exp '<' [right_exp:]exp |
        {greater_than:} [left_exp:]exp '>' [right_exp:]exp |
        {equal:} [left_exp:]exp '=' [right_exp:]exp;

      exp =
        {value:} value |
        {plus:} [left_exp:]value '+' [right_exp:]value |
        {minus:} [left_exp:]value '-' [right_exp:]value |
        {mult:} [left_exp:]value '*' [right_exp:]value |
        {div:} [left_exp:]value '/' [right_exp:]value |
        {mod:} [left_exp:]value 'MOD' [right_exp:]value;

      value =
        {constant:} number |
        {identifier:} identifier |
        {expression:} '(' exp ')';
|
|
Re: SableCC 4 and old grammar code base

Just a note to those who have participated in the Unicode identifiers discussion. I have not forgotten about it. I just changed the proposed solution. Read below.

The old SableCC approach for identifiers and keywords was simply too useful to throw away. Using Lexer instead of $lexer is more visually attractive, for one thing. But mostly, the camel case conversion of old identifiers was just too useful. Being able to convert some_name to SomeName without problems (e.g. ambiguous upper case, or no concept of lower/upper case in some scripts) is very convenient.

So, I decided to retain the old "pure ASCII" identifiers (with the old rules: no upper case, etc.). But I also allow for rich identifiers, made up of Unicode characters. A rich identifier is enclosed within "<" and ">", and it may not contain the underscore "_" character. This way, I can get unambiguous conversions and I am also able to concatenate identifiers to create new names.

e.g.

    prod_name = {alt_name:} ... | ...;

Generates: PProdName, AProdName_AltName (yes, different from SableCC3, but it eliminates name conflicts).

    <Gagnon> = {<Étienne>:} ... | ...;

Generates: P_Gagnon, A_Gagnon__Étienne.

In other words, rich identifiers are converted by adding a "_" prefix.

This way, we (hopefully) get to please everybody. We make things easy for normal uses, and possible for complex uses.

Etienne

Etienne M. Gagnon wrote:
> 1- Both Helpers and Tokens sections are merged into a single Lexer
> section. Ignored is a subsection of Lexer.
> [...]

--
Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org
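
The naming scheme described in this message can be sketched in a few lines of Python. This is only a reading of the two examples given (PProdName, AProdName_AltName, P_Gagnon, A_Gagnon__Étienne), not SableCC's actual implementation. The function names `name_part`, `production_class` and `alternative_class` are hypothetical, and the exact rule for valid "pure ASCII" identifiers is an assumption, since the message only says "no upper case, etc.".

```python
import re

# Assumed shape of a "pure ASCII" identifier: lower-case words joined by "_".
ASCII_IDENT = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")
# A rich identifier: anything between "<" and ">" that contains no "_".
RICH_IDENT = re.compile(r"<([^<>_]+)>")


def name_part(ident: str) -> str:
    """Turn a grammar identifier into the fragment used in generated names."""
    rich = RICH_IDENT.fullmatch(ident)
    if rich:
        # Rich identifiers are kept as written and get a "_" prefix.
        return "_" + rich.group(1)
    if ASCII_IDENT.fullmatch(ident):
        # some_name -> SomeName
        return "".join(word.capitalize() for word in ident.split("_"))
    raise ValueError(f"not a valid identifier: {ident!r}")


def production_class(prod: str) -> str:
    """Name for a production, e.g. prod_name -> PProdName."""
    return "P" + name_part(prod)


def alternative_class(prod: str, alt: str) -> str:
    """Name for an alternative, e.g. (prod_name, alt_name) -> AProdName_AltName."""
    return "A" + name_part(prod) + "_" + name_part(alt)


if __name__ == "__main__":
    print(production_class("prod_name"), alternative_class("prod_name", "alt_name"))
    print(production_class("<Gagnon>"), alternative_class("<Gagnon>", "<Étienne>"))
```

Because rich identifiers cannot contain "_" and ASCII fragments never start with one, the "_" separator and the "_" prefix stay unambiguous when the fragments are concatenated.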
|
|
RE: SableCC 4 and old grammar code base

Wow, that's interesting! That means I'll need to update my Compiler Design textbook to use SableCC version 4, but I'll also need to maintain the existing book for people who are still using SableCC version 3. It's a good thing the whole book is available on the web, so that users can select whichever version they wish.

Seth D. Bergmann
Associate Professor
Computer Science
bergmann@...
Rowan University
856-256-4500 ext. 3197
Glassboro NJ 08028
Fax: 856-256-4741

-----Original Message-----
From: sablecc-discussion-bounces+bergmann=rowan.edu@... [mailto:sablecc-discussion-bounces+bergmann=rowan.edu@...] On Behalf Of Etienne M. Gagnon
Sent: Friday, March 06, 2009 10:22 AM
To: Discussion mailing list for the SableCC project
Subject: Re: SableCC 4 and old grammar code base

The SableCC 4 syntax is quite close to the previous syntax. I needed to change the syntax, a little, to fix problems in the old one, and to allow for new features.
[...]
|
|
Re: SableCC 4 and old grammar code base

Hi Seth,

You should definitely add a link to your book on http://sablecc.org/wiki/DocumentationPage . You should probably create a "Books" section.

Have fun!

Etienne

Bergmann, Seth wrote:
> Wow, that's interesting! That means I'll need to update my Compiler Design textbook to use sablecc version 4, but I'll also need to maintain the existing book for people who are still using sablecc version 3. It's a good thing the whole book is available on the web, so that users can select which ever version they wish.
>

--
Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org
|
|
RE: SableCC 4 and old grammar code base

Great idea, Etienne!

I also included Andrew Appel's book, which uses SableCC, on the wiki page.

Sincerely,

Seth D. Bergmann
Associate Professor
Computer Science
bergmann@...
Rowan University
856-256-4500 ext. 3197
Glassboro NJ 08028
Fax: 856-256-4741

-----Original Message-----
From: sablecc-discussion-bounces+bergmann=rowan.edu@... [mailto:sablecc-discussion-bounces+bergmann=rowan.edu@...] On Behalf Of Etienne M. Gagnon
Sent: Friday, March 06, 2009 3:49 PM
To: Discussion mailing list for the SableCC project
Subject: Re: SableCC 4 and old grammar code base

Hi Seth,

You should definitely add a link to your book on http://sablecc.org/wiki/DocumentationPage . You should probably create a "Books" section.

Have fun!

Etienne
[...]