problem where I can't ignoring text / tokens

Hey there,

I'm having a problem where I want the same behaviour from the 'xpath_chars' token as from the 'word' token. Both 'word' and 'xpath_chars' are used in my grammar file (see fig. 0).

GOOD -> In the Productions below, when I use a 'word' token in between quotes, all the characters are ignored (see fig. 1), or at least given no meaning in the 'freetext' state.

BAD -> But when I try that same behaviour with my xpath_chars token, the parser fails (see fig. 2). It does work when I only use 'word' tokens (see fig. 3).

QUESTION -> Is there a way I can ignore any character in between backquotes ' ` '? I just can't get this one to work (grammar file attached).

    Tokens
      ...
      {bkeeping -> freetext, freetext -> bkeeping} quote = ( '"' | ''' );
      {bkeeping -> freetext, freetext -> bkeeping} backquote = '`';
      word = ( lowercase | uppercase | dash | underscore | digit | dot )+;
      xpath_chars = ( ats | forwardslash | colon_helper | left_bracket | right_bracket |
                      lsquare_bracket | rsquare_bracket | equals_helper | double_quote | single_quote );
      xmlns = 'xmlns';
      decl_xml = 'xml';
      decl_dtd = 'DOCTYPE';
      eoll = (cr | lf | cr lf)?;

    Productions
      ...

fig. 0

    create ( <journal id="thing-thing" /> );    // [ok] word is used in between the double quotes "

fig. 1

    load ( `/system[@id='main.system']` );    // [x] xpath goes in between the back quotes `

    ERROR [Thread-3] (Bkell.java:161) - [11,9] expecting: '`', word, xpath chars
    com.interrupt.bookkeeping.cc.parser.ParserException: [11,9] expecting: '`', word, xpath chars
        at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:998)
        at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:135)
        at java.lang.Thread.run(Thread.java:613)

fig. 2

    load ( `` );
    load ( `asdf` );

fig. 3

Thanks in advance
Tim

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: problem where I can't ignoring text / tokens

The easiest way to investigate lexer problems is to use a debug lexer such as the one proposed in:

http://lists.sablecc.org/pipermail/sablecc-discussion/msg00311.html

I've attached a minimal Main.java file to test your grammar with an anonymous debug lexer. Running it on your example reveals the problem: the "/" is matched to a "TFslash" instead of a "TXpathChars".

Do not forget that, with SableCC 2 & 3, if no state is specified before a token, then this token is matched in all states. Also, if two tokens match a string, the one that appears first wins.

Have fun!

Etienne

-- 
Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org

Main.java:

    import com.interrupt.bookkeeping.cc.parser.*;
    import com.interrupt.bookkeeping.cc.lexer.*;
    import com.interrupt.bookkeeping.cc.node.*;
    import com.interrupt.bookkeeping.cc.analysis.*;
    import java.io.*;

    public class Main {
        public static void main(String[] args) throws Exception {
            // Parse standard input with an anonymous subclass of the generated Lexer.
            new Parser(new Lexer(new PushbackReader(
                    new InputStreamReader(System.in), 1024)) {
                // filter() runs for every token read; print its class,
                // the current lexer state and the matched text.
                protected void filter() {
                    System.out.println(token.getClass()
                            + ", state : " + state.id()
                            + ", text : [" + token.getText() + "]");
                }
            }).parse();
        }
    }

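The ordering rule in the reply above (when two tokens match the same text, the one declared first wins) can be tried out in isolation with a small stand-alone Java sketch. This is not SableCC code: the class `FirstMatchDemo`, its methods, and the regular expressions standing in for the `fslash`, `xpath_chars` and `word` tokens are assumptions made purely for illustration. The sketch also prefers the longest match before falling back to declaration order, which is how lexers of this kind are commonly described rather than something stated in the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy lexer (hypothetical, not generated by SableCC): longest match wins,
// and on a tie the rule declared first wins.
final class FirstMatchDemo {

    // Builds an ordered rule table from (name, regex) pairs.
    static Map<String, Pattern> rules(String... nameThenRegex) {
        Map<String, Pattern> table = new LinkedHashMap<>();
        for (int i = 0; i < nameThenRegex.length; i += 2) {
            table.put(nameThenRegex[i], Pattern.compile(nameThenRegex[i + 1]));
        }
        return table;
    }

    // Splits the input into "name[text]" entries using the rule table.
    static List<String> tokenize(Map<String, Pattern> rules, String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly greater: an equal-length match never displaces an earlier rule.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestName = rule.getKey();
                    bestLen = m.end() - pos;
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at offset " + pos);
            }
            out.add(bestName + "[" + input.substring(pos, pos + bestLen) + "]");
            pos += bestLen;
        }
        return out;
    }

    public static void main(String[] args) {
        String xpathChars = "[@/:\\[\\]=]";   // assumed stand-in for xpath_chars
        String word = "[A-Za-z0-9_.-]+";      // assumed stand-in for word

        // fslash declared before xpath_chars: "/" becomes an fslash token.
        System.out.println(tokenize(
                rules("fslash", "/", "xpath_chars", xpathChars, "word", word), "/system"));
        // prints [fslash[/], word[system]]

        // xpath_chars declared first: the same "/" is now an xpath_chars token.
        System.out.println(tokenize(
                rules("xpath_chars", xpathChars, "fslash", "/", "word", word), "/system"));
        // prints [xpath_chars[/], word[system]]
    }
}
```

Swapping the declaration order is the only difference between the two calls, which mirrors the kind of mismatch the debug lexer revealed. In a real SableCC grammar the other lever is the state prefix on each token, which this sketch does not model.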
Re: problem where I can't ignoring text / tokens

Yessss, thanks very much. I can now mix the languages like in the statements below ( 'mylang', xml & xpath ). It turns out that the Lexer debugger gave me the info I needed to sort through the token mismatches.

    login ( user -username root -password password );
    create ( <journal xmlns='com/interrupt/bookkeeping/journal' id='new.journal' /> );
    load ( `/` );
    load ( `/system[ @id='main.system' and date='01/01/2009' ]/groups[@id='main.groups']` );
    load ( <journal xmlns='com/interrupt/bookkeeping/journal' id='new.journal' /> );
    commit ( (`/system[@id='main.system']/groups[@id='main.groups']/group[@id='webkell']/bookkeeping[@id='main.bookkeeping']/journals[@id='main.journals']`)
             <journal xmlns='com/interrupt/bookkeeping/journal' id='new.journal' /> );

Cheers
Tim

From: Etienne M. Gagnon <egagnon@...>
To: Discussion mailing list for the SableCC project <sablecc-discussion@...>
Sent: Wednesday, February 11, 2009 4:40:13 PM
Subject: Re: problem where I can't ignoring text / tokens

> Hi Timothy,
> [...]

Shift reduce problem?

Hi there.

I'm trying to implement the null coalescing operator in a little expression language we have, and I'm running into a shift-reduce error that I'm not understanding.

The grammar in question is below, shortened a bit to focus on the problem at hand. We're still on the old 3.2 platform, if that helps.

Error msg follows:

    shift/reduce conflict in state [stack: PCNullCoalesceExpr *] on TTQuestion in {
        [ PCExpr = PCNullCoalesceExpr * TTQuestion PCExpr TTColon PCExpr ] (shift),
        [ PCExpr = PCNullCoalesceExpr * ] followed by TTQuestion (reduce)
    }
        at org.sablecc.sablecc.GenParser.caseStart(GenParser.java:227)
        at org.sablecc.sablecc.node.Start.apply(Start.java:33)
        at org.sablecc.sablecc.SableCC.processGrammar(SableCC.java:391)
        at org.sablecc.sablecc.SableCC.processGrammar(SableCC.java:280)
        at org.sablecc.sablecc.SableCC.main(SableCC.java:220)

and the grammar:

--------------------------

    Package Expressions;

    Helpers
      unicode_input_character = [ 0 .. 0xffff ];
      tab=0x0009;
      lf=0x000a;
      cr=0x000d;
      eol = [[cr + lf] + [cr + lf]];
      white = [[' ' + tab] + eol];
      input_character = [ unicode_input_character - [ cr + lf ] ];
      escape_sequence = '\b' | '\t' | '\n' | '\f' | '\r' | '\"' | '\' | ''' | '\\';
      string_character = [ input_character - [ '"' + '\' ] ] | escape_sequence;
      single_character = [ input_character - [ ''' + '\' ] ] ;
      alpha = [['A'..'Z'] + ['a'..'z']];
      numeral = ['0'..'9'];
      alphanumeric = [numeral + alpha];

      // override each letter of the alphabet to ensure case insensitivity for keywords.
      a = 'a' | 'A'; b = 'b' | 'B'; c = 'c' | 'C'; d = 'd' | 'D'; e = 'e' | 'E'; f = 'f' | 'F';
      g = 'g' | 'G'; h = 'h' | 'H'; i = 'i' | 'I'; j = 'j' | 'J'; k = 'k' | 'K'; l = 'l' | 'L';
      m = 'm' | 'M'; n = 'n' | 'N'; o = 'o' | 'O'; p = 'p' | 'P'; q = 'q' | 'Q'; r = 'r' | 'R';
      s = 's' | 'S'; t = 't' | 'T'; u = 'u' | 'U'; v = 'v' | 'V'; w = 'w' | 'W'; x = 'x' | 'X';
      y = 'y' | 'Y'; z = 'z' | 'Z';

    States
      base;

    Tokens
      white = white+;
      t_double_question = '??';
      t_question = '?';
      t_colon = ':';
      t_shim = s h i m;

    Ignored Tokens
      white, line_comment, multiline_comment;

    Productions
      c_expr {-> a_expr } =
          [expr]:c_conditional_expr
              {-> New a_expr( expr.a_conditional_expr ) } ;

      c_conditional_expr {-> a_conditional_expr } =
          { q_passthrough } [expr]:c_null_coalesce_expr
              {-> New a_conditional_expr.passthrough( expr.a_null_coalesce_expr ) } |
          { q_conditional } [if_expr]:c_null_coalesce_expr t_question [true_expr]:c_expr t_colon [false_expr]:c_expr
              {-> New a_conditional_expr.conditional( if_expr.a_null_coalesce_expr, true_expr.a_expr, false_expr.a_expr ) } ;

      c_null_coalesce_expr {-> a_null_coalesce_expr } =
          { q_passthrough } [expr]:c_conditional_or_expr
              {-> New a_null_coalesce_expr.passthrough( expr.a_conditional_or_expr ) } |
          { q_null_coalesce } [left]:c_conditional_or_expr t_double_question [right]:c_expr
              {-> New a_null_coalesce_expr.coalesce( left.a_conditional_or_expr, right.a_expr ) } ;

      c_conditional_or_expr {-> a_conditional_or_expr } =
          { temp } t_shim
              {-> New a_conditional_or_expr.shim( t_abs ) } ;

    Abstract Syntax Tree
      a_expr =
          [expr]:a_conditional_expr ;

      a_conditional_expr =
          { passthrough } [expr]:a_null_coalesce_expr |
          { conditional } [if_expr]:a_null_coalesce_expr [true_expr]:a_expr [false_expr]:a_expr ;

      a_null_coalesce_expr =
          { passthrough } [expr]:a_conditional_or_expr |
          { coalesce } [left]:a_conditional_or_expr [right]:a_expr ;

      a_conditional_or_expr =
          { shim } t_shim ;

How to write an unambiguous expression grammar [was: Shift reduce problem?]

Hi Chris,

The conflict is due to the well-known expression grammar ambiguity.

Here's the idea. A typical expression grammar looks like:

    exp =
      {add} exp plus exp |
      {sub} exp minus exp |
      {mul} exp star exp |
      {div} exp slash exp |
      {num} number;

This grammar is ambiguous for 2 reasons:
1- operator precedence : 5 + 2 * 3 usually means 5 + (2 * 3), not (5 + 2) * 3
2- associativity : 5 - 3 - 2 usually means (5 - 3) - 2, not 5 - (3 - 2)

The grammar allows for all these interpretations (resulting in different syntax trees for the same input text). This is obviously undesirable! We don't want the parser to randomly select one interpretation: 5 + 2 * 3 == 11 or 21 depending on some random parsing choice...

To solve the ambiguity, operator precedence and associativity must first be determined. So, let's decide it for the above grammar.

    Priority // highest to lowest precedence
      Left mul, div;
      Left add, sub;

The trick, now, is to rewrite the grammar using one production per precedence level, starting with the LOWEST priority. Left associativity corresponds to left recursion, and right associativity to right recursion.

    exp = // lowest priority, left associative
      {add} exp plus factor |
      {sub} exp minus factor |
      {simple} factor;

    factor = // left associative
      {mul} factor star term |
      {div} factor slash term |
      {simple} term;

    term =
      {num} number |
      {par} l_par exp r_par;

Notes:
1- You must add a "simple" alternative at each precedence level for expressions that do not use current precedence operators. E.g. 5 * 2 is an expression with neither addition nor subtraction.
2- The atomic production "term" may be neither left nor right recursive.
3- Usually, the parenthesized expression is added as a term, for expressiveness. If I want to express (5 + 2) * 3, I need parentheses. Conveniently, "l_par exp r_par" is neither left nor right recursive. So, it is a valid term.
4- The leftmost and rightmost element of an alternative may only be the current production (recursion) or the next-level production.

Your grammar breaks rule "4-", so you are getting a conflict message related to some resulting ambiguity.

By applying the above approach, the conflict should disappear (e.g. decide on precedence and associativity of t_question and t_double_question and probably add a parenthesized expression term, in addition to t_shim, for expressiveness).

Have fun!

Etienne

Christopher Van Kirk wrote:
> Hi there.
>
> I'm trying to implement the null coalescing operator in a little expression language we have, and I'm running into a shift-reduce error that I'm not understanding.
>
> The grammar in question is below, shortened a bit to focus on the problem at hand. We're still on the old 3.2 platform, if that helps.
>
> Error msg follows:
>
> shift/reduce conflict in state [stack: PCNullCoalesceExpr *] on TTQuestion in {
> [ PCExpr = PCNullCoalesceExpr * TTQuestion PCExpr TTColon PCExpr ] (shift),
> [ PCExpr = PCNullCoalesceExpr * ] followed by TTQuestion (reduce)
> }
> [...]

-- 
Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org

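To see the effect of the layered exp / factor / term grammar on the two examples from the message, here is a small hand-written Java evaluator. It is only a sketch under stated assumptions: SableCC generates a table-driven parser from the grammar, whereas this code is a recursive-descent analogue in which each left-recursive production becomes a loop; the class name `PrecedenceDemo` and its methods are hypothetical, and integer arithmetic is assumed.

```java
// Hand-written analogue of the layered grammar (not SableCC output).
// One method per precedence level; left recursion is expressed as a loop,
// which yields left associativity.
final class PrecedenceDemo {
    private final String src;
    private int pos;

    private PrecedenceDemo(String src) {
        this.src = src.replace(" ", "");   // ignore blanks, like an ignored token
    }

    // Evaluates a whole expression and rejects trailing input.
    static int eval(String text) {
        PrecedenceDemo p = new PrecedenceDemo(text);
        int value = p.exp();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at offset " + p.pos);
        }
        return value;
    }

    // exp = {add} exp plus factor | {sub} exp minus factor | {simple} factor;
    private int exp() {
        int value = factor();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            int rhs = factor();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // factor = {mul} factor star term | {div} factor slash term | {simple} term;
    private int factor() {
        int value = term();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            int rhs = term();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // term = {num} number | {par} l_par exp r_par;
    private int term() {
        if (peek() == '(') {
            pos++;
            int value = exp();
            if (peek() != ')') {
                throw new IllegalArgumentException("')' expected at offset " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (Character.isDigit(peek())) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("number expected at offset " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    // Returns the current character, or '\0' at end of input.
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(eval("5 + 2 * 3"));    // prints 11
        System.out.println(eval("5 - 3 - 2"));    // prints 0
        System.out.println(eval("(5 + 2) * 3"));  // prints 21
    }
}
```

Because `exp` only ever calls `factor` for its operands, and `factor` only calls `term`, multiplication binds tighter than addition without any extra bookkeeping. A right-associative operator would instead recurse on the right-hand operand rather than loop.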
Re: How to write an unambiguous expression grammar [was: Shift reduce problem?]

Thanks Etienne, that solved it. The null coalesce production looped back to expr when it should have recursed onto itself.

Much appreciated as usual!

--- On Thu, 2/26/09, Etienne M. Gagnon <egagnon@...> wrote:

> From: Etienne M. Gagnon <egagnon@...>
> Subject: How to write an unambiguous expression grammar [was: Shift reduce problem?]
> To: "Discussion mailing list for the SableCC project" <sablecc-discussion@...>
> Date: Thursday, February 26, 2009, 11:14 PM
>
> Hi Chris,
> [...]
