Re: Fw: Fixing Parser Weirdness

by frye-4

With this debugging turned on, I can see that the parser does not like the whitespace before the left bracket.

login ( user -username root -password password );
debugging... lexer.peek[com.interrupt.bookkeeping.cc.node.TWhitespace /  ] / action[0] / state[12]
ERROR [Thread-3] (Bkell.java:187) - [28,7] expecting: lbracket
com.interrupt.bookkeeping.cc.parser.ParserException: [28,7] expecting: lbracket
    at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:1025)
    at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:146)
    at java.lang.Thread.run(Thread.java:637)
ERROR [Thread-3] (Util.java:96) - generateBkellException CALLED
ERROR [Thread-3] (Util.java:102) - TOP error message[[28,7] expecting: lbracket]
ERROR [Thread-3] (Util.java:103) - ROOT error message[[28,7] expecting: lbracket]
ERROR [Thread-3] (Bkell.java:189) - <logs xmlns='com/interrupt/logs' id='' ><log xmlns='com/interrupt/logs' level='ERROR' id='' ><logMessages xmlns='com/interrupt/logs' id='' ><logMessage xmlns='com/interrupt/logs' id='' >[28,7] expecting: lbracket</logMessage>
</logMessages>
</log>
</logs>




So I tried removing the whitespace and still got the error below. It thinks the left bracket is a right bracket.

login( user -username root -password password );
debugging... lexer.peek[com.interrupt.bookkeeping.cc.node.TRbracket / (] / action[0] / state[12]
ERROR [Thread-3] (Bkell.java:187) - [20,6] expecting: lbracket
com.interrupt.bookkeeping.cc.parser.ParserException: [20,6] expecting: lbracket
    at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:1025)
    at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:146)
    at java.lang.Thread.run(Thread.java:637)
ERROR [Thread-3] (Util.java:96) - generateBkellException CALLED
ERROR [Thread-3] (Util.java:102) - TOP error message[[20,6] expecting: lbracket]
ERROR [Thread-3] (Util.java:103) - ROOT error message[[20,6] expecting: lbracket]
ERROR [Thread-3] (Bkell.java:189) - <logs xmlns='com/interrupt/logs' id='' ><log xmlns='com/interrupt/logs' level='ERROR' id='' ><logMessages xmlns='com/interrupt/logs' id='' ><logMessage xmlns='com/interrupt/logs' id='' >[20,6] expecting: lbracket</logMessage>
</logMessages>
</log>
</logs>



The question for me is: how can the addition of line 98 (in my sablecc file) break the parser so badly?
 97         currency_opt = '-currency' ws+ ( lowercase | uppercase | dash | colon_helper | ats | underscore | digit | dot )*;
 98         returninput_opt = '-returninput' ws+ ( lowercase | uppercase | dash | colon_helper | ats | underscore | digit | dot )*;
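One general way a single new token definition can change the stream the parser sees is the longest-match rule: a SableCC lexer takes the longest match at each position and, on a tie, the token defined first. The sketch below is not a diagnosis of this grammar. It is a toy lexer built on `java.util.regex`, and every name in it (`LongestMatchDemo`, `baseDefs`, `withReturnInput`, the three-token base set, the simplified `[a-z]*` tail) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchDemo {
    // Hypothetical, heavily simplified token set: name -> regex, in definition order.
    static Map<String, String> baseDefs() {
        Map<String, String> defs = new LinkedHashMap<>();
        defs.put("dash", "-");
        defs.put("word", "[a-z]+");
        defs.put("ws", "[ ]+");
        return defs;
    }

    // The same set plus one token shaped like line 98 (character class simplified).
    static Map<String, String> withReturnInput() {
        Map<String, String> defs = baseDefs();
        defs.put("returninput_opt", "-returninput[ ]+[a-z]*");
        return defs;
    }

    // At each offset, pick the definition with the longest match.
    // The strict '>' means an earlier definition wins a tie.
    static List<String> tokenize(Map<String, String> defs, String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, String> def : defs.entrySet()) {
                Matcher m = Pattern.compile(def.getValue()).matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestName = def.getKey();
                    bestLen = m.end() - pos;
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no token matches at offset " + pos);
            }
            out.add(bestName + "[" + input.substring(pos, pos + bestLen) + "]");
            pos += bestLen;
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "-returninput true";
        System.out.println(tokenize(baseDefs(), input));        // four small tokens
        System.out.println(tokenize(withReturnInput(), input)); // one long token
    }
}
```

With the extra definition, the same text is delivered to the parser as one token instead of four, so any production written against the old stream no longer matches. Whether something similar happens around `login (` in the real grammar can only be confirmed by dumping the actual lexer output.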




I've attached my working files (Parser.java, bkeeping.cc). And you can get the latest git repo from http://repo.or.cz/w/Bookkeeping.git .


Thanks
Tim


On Fri, Oct 9, 2009 at 2:02 PM, Timothy Washington <timothyjwashington@...> wrote:


----- Forwarded Message ----
From: Christopher Van Kirk <chris.vankirk@...>
To: Discussion mailing list for the SableCC project <sablecc-discussion@...>
Sent: Thu, October 8, 2009 10:39:39 PM
Subject: RE: Fixing Parser Weirdness

This is a useful gem from Indrek Mandre that prints out the lexer stream (my version is for .net but you can translate easily). Use this then analyse the lexer output and see if you can match it to your syntax. I suspect you’re getting tokens you don’t expect.

 

using System;

// Prints every token the lexer emits, one per line, until EOF.
class TestLexer
{
  public static void Main(String[] args)
  {
    Lexer l = new Lexer(Console.In);
    while (true)
    {
        Token token = l.Next();
        Console.WriteLine("Read token '" + token.GetType().Name +
            "', Text = [" + token.Text + "] at [" + token.Line +
            "," + token.Pos + "]");
        if ( token is EOF ) break;
    }
  }
}
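For the Java side, here is a rough equivalent of the loop above that runs on its own. The generated classes are not available here, so the lexer is a stand-in: `TokenDump`, its token kinds and its regular expression are all hypothetical. With the real generated lexer, the loop body would instead call `lexer.next()` and read `getText()`, `getLine()` and `getPos()` from the returned token, stopping at `EOF`; that is how I recall the SableCC Java API, so check it against your generated code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenDump {
    // Hypothetical token kinds; each one is a named group in TOKENS.
    private static final String[] KINDS = {
        "TLogin", "TLbracket", "TRbracket", "TSemicolon", "TWhitespace", "TWord"
    };
    private static final Pattern TOKENS = Pattern.compile(
        "(?<TLogin>login)|(?<TLbracket>\\()|(?<TRbracket>\\))|(?<TSemicolon>;)"
        + "|(?<TWhitespace>\\s+)|(?<TWord>[-A-Za-z0-9]+)");

    // Returns one line per token in the same format as the C# dump, ending with EOF.
    static List<String> dump(String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = TOKENS.matcher(input);
        int offset = 0, line = 1, col = 1;
        while (offset < input.length()) {
            m.region(offset, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unknown token at [" + line + "," + col + "]");
            }
            String kind = null;
            for (String k : KINDS) {
                if (m.group(k) != null) { kind = k; break; }
            }
            String text = m.group();
            lines.add("Read token '" + kind + "', Text = [" + text
                + "] at [" + line + "," + col + "]");
            // Advance the 1-based line/column counters past the token text.
            for (char c : text.toCharArray()) {
                if (c == '\n') { line++; col = 1; } else { col++; }
            }
            offset = m.end();
        }
        lines.add("Read token 'EOF', Text = [] at [" + line + "," + col + "]");
        return lines;
    }

    public static void main(String[] args) {
        dump("login ( user );").forEach(System.out::println);
    }
}
```

The useful part is that whitespace shows up as its own line in the dump, so it is obvious whether the parser is being handed a whitespace token between `login` and `(`.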

 

 

 

 

-----Original Message-----
From: sablecc-discussion-bounces+chris.vankirk=fdcjapan.com@lists.sablecc.org [mailto:...=fdcjapan.com@lists.sablecc.org] On Behalf Of Timothy Washington
Sent: Friday, October 09, 2009 5:37 AM
To: Discussion mailing list for the SableCC project
Subject: Re: Fixing Parser Weirdness

 

Hey there, thanks for the feedback. Agreed, I think that if I solve 'A', then the rest will fall into place. What I am spitting out (the stream, as you put it) is the "action array / lexer.peek() / state" as seen from the Parser's point of view. The output was in bold red in fig. 2 (logs). Line 98 in fig. 1 is what generated that error (everything works if I remove line 98). See line 166 of the Parser.parse method below; that's where I'm putting the debug information. But there's probably a better way of isolating the error.


Parser.java (client won't let me attach java file)
...
 112     @SuppressWarnings("unchecked")
 113     public Start parse() throws ParserException, LexerException, IOException
 114     {
 115         push(0, null, true);
 116         List<Node> ign = null;
 117         while(true)
 118         {
 119             while(index(this.lexer.peek()) == -1)
 120             {
 121                 if(ign == null)
 122                 {
 123                     ign = new LinkedList<Node>();
 124                 }
 125
 126                 ign.add(this.lexer.next());
 127             }
 128
 129             if(ign != null)
 130             {
 131                 this.ignoredTokens.setIn(this.lexer.peek(), ign);
 132                 ign = null;
 133             }
 134
 135             this.last_pos = this.lexer.peek().getPos();
 136             this.last_line = this.lexer.peek().getLine();
 137             this.last_token = this.lexer.peek();
 138
 139             int index = index(this.lexer.peek());
 140             this.action[0] = Parser.actionTable[state()][0][1];
 141             this.action[1] = Parser.actionTable[state()][0][2];
 142
 143             int low = 1;
 144             int high = Parser.actionTable[state()].length - 1;

 145
 146             while(low <= high)
 147             {
 148                 int middle = (low + high) / 2;
 149
 150                 if(index < Parser.actionTable[state()][middle][0])
 151                 {
 152                     high = middle - 1;
 153                 }
 154                 else if(index > Parser.actionTable[state()][middle][0])
 155                 {
 156                     low = middle + 1;
 157                 }
 158                 else
 159                 {
 160                     this.action[0] = Parser.actionTable[state()][middle][1];
 161                     this.action[1] = Parser.actionTable[state()][middle][2];
 162                     break;
 163                 }
 164             }
 165
 166             System.out.println("debugging... action["+this.action[0]+"] / lexer.peek["+this.lexer.peek()+"] / state["+state()+"]");
 167
 168             switch(this.action[0])
 169             {
 170                 case SHIFT:
 171                     {
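The action lookup in lines 139-164 above can be pulled out into a small standalone sketch. It assumes, as the listing suggests, that entry 0 of a state's row is the default action and that the remaining entries are `{tokenIndex, action, target}` triples sorted by token index. The class name `ActionLookup`, the sample row, and the numeric values of the action constants are assumptions; the values 0 and 3 are at least consistent with the earlier log, where `action[0]` shifts `login` and `action[3]` comes right before the "expecting: lbracket" error.

```java
class ActionLookup {
    // Assumed values; the generated Parser defines its own constants.
    static final int SHIFT = 0, REDUCE = 1, ACCEPT = 2, ERROR = 3;

    // row[0] is the default {unused, action, target};
    // row[1..] are {tokenIndex, action, target}, sorted by tokenIndex.
    static int[] lookup(int[][] row, int tokenIndex) {
        int[] action = { row[0][1], row[0][2] };   // start from the default
        int low = 1;
        int high = row.length - 1;
        while (low <= high) {
            int middle = (low + high) >>> 1;
            if (tokenIndex < row[middle][0]) {
                high = middle - 1;
            } else if (tokenIndex > row[middle][0]) {
                low = middle + 1;
            } else {
                action[0] = row[middle][1];
                action[1] = row[middle][2];
                break;
            }
        }
        return action;
    }

    public static void main(String[] args) {
        // Hypothetical row: default is ERROR, token 4 shifts to state 20, token 9 reduces.
        int[][] row = { {-1, ERROR, 7}, {4, SHIFT, 20}, {9, REDUCE, 3} };
        int[] hit = lookup(row, 4);
        int[] miss = lookup(row, 5);
        System.out.println("token 4 -> action " + hit[0] + ", target " + hit[1]);
        System.out.println("token 5 -> action " + miss[0] + ", target " + miss[1]);
    }
}
```

Any token index that is not listed for the current state falls through to the default entry, which is how an unexpected token type in a given state ends up as a parse error.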



Thanks
Tim

 


From: Christopher Van Kirk <chris.vankirk@...>
To: Discussion mailing list for the SableCC project <sablecc-discussion@...>
Sent: Wed, October 7, 2009 8:18:22 PM
Subject: RE: Fixing Parser Weirdness

Have you tried the approach to solving ‘A’ that I suggested in my last posting? If you have and you’re still confused then post the token stream emitted by your lexer and we can go from there.

 

In solving A you may acquire the skills you need to solve B, so I think you should just let B slide for the moment.

 

-----Original Message-----
From: sablecc-discussion-bounces+chris.vankirk=fdcjapan.com@lists.sablecc.org [mailto:...=fdcjapan.com@lists.sablecc.org] On Behalf Of Timothy Washington
Sent: Thursday, October 08, 2009 7:50 AM
To: Discussion mailing list for the SableCC project
Subject: Re: Fixing Parser Weirdness

 

Hey Chris, thanks for responding to this problem thus far. I've been tied up on other projects, but have come back to these problems. Right now I want to do 2 things:

A) Be able to add a new token (that provides an extra option in the language, -returninput).
B) Allow special characters in an XML block (<mytag value='twashing@...' />).

So looking at problem A), I was actually printing out the characters the Parser was picking from the Lexer. If I add the token on line 98 (see fig. 1), I get the error in fig. 2. Normally, the expression "login ( user -username root -password password );" works fine. But with line 98 added, I get the error that the Parser / Lexer is expecting an opening bracket "(", which is obviously there. So now the question is i) what does it think the "login (..." is, and ii) how does line 98 change a working token to this broken state? I've also re-attached my sablecc file.


97        currency_opt = '-currency' ws+ ( lowercase | uppercase | dash | colon_helper | ats | underscore | digit | dot )*;
98        returninput_opt = '-returninput' ws+ ( lowercase | uppercase | dash | colon_helper | ats | underscore | digit | dot )*;

fig. 1 - bkeeping.cc sablecc file


login ( user -username root -password password );
debugging... action[0] / lexer.peek[login ] / state[0]
debugging... action[3] / lexer.peek[( ] / state[12]

ERROR [Thread-3] (Bkell.java:187) - [14,7] expecting: lbracket
com.interrupt.bookkeeping.cc.parser.ParserException: [14,7] expecting: lbracket
    at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:1024)
    at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:146)
    at java.lang.Thread.run(Thread.java:637)
ERROR [Thread-3] (Util.java:96) - generateBkellException CALLED
ERROR [Thread-3] (Util.java:102) - TOP error message[[14,7] expecting: lbracket]
ERROR [Thread-3] (Util.java:103) - ROOT error message[[14,7] expecting: lbracket]
ERROR [Thread-3] (Bkell.java:189) - <logs xmlns='com/interrupt/logs' id='' ><log xmlns='com/interrupt/logs' level='ERROR' id='' ><logMessages xmlns='com/interrupt/logs' id='' ><logMessage xmlns='com/interrupt/logs' id='' >[14,7] expecting: lbracket</logMessage>
</logMessages>
</log>

</logs>
fig. 2 - logs


Thanks
Tim



----- Original Message ----
From: Christopher Van Kirk <chris.vankirk@...>
To: Discussion mailing list for the SableCC project <sablecc-discussion@...>
Sent: Thu, September 3, 2009 10:47:11 PM
Subject: Re: Fixing Parser Weirdness

When I have a problem like this, generally I'll print out the token stream for the affected area and try to match it to the production I think the parser should choose. When I say print out the token stream, I mean print out the actual stream as generated by the lexer.

If you do that, I think you will find that you either have grammatical or lexical problems (either the token stream isn't what you think it is, or the syntax isn't what you want it to be). It's good to get into practice doing this sort of thing because this will come up from time to time and it's best to be able to resolve it by yourself.

Also note that SableCC is telling you what token it thinks should come next. From that you can deduce what production is actually being selected, which again should help you identify the problem.
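The matching step described above can be done mechanically once both lists are written down. The sketch below compares the token names you expect a production to consume with the names actually dumped from the lexer and reports the first position where they differ. `StreamCheck`, `firstMismatch`, `at` and all token names are hypothetical; substitute the names from your own dump.

```java
import java.util.Arrays;
import java.util.List;

class StreamCheck {
    // Index of the first position where the two streams differ, or -1 if identical.
    static int firstMismatch(List<String> expected, List<String> actual) {
        int n = Math.min(expected.size(), actual.size());
        for (int i = 0; i < n; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i;
            }
        }
        // Same prefix: they differ only if one stream is longer than the other.
        return expected.size() == actual.size() ? -1 : n;
    }

    // Token name at index i, or a marker when the stream has already ended.
    static String at(List<String> tokens, int i) {
        return i < tokens.size() ? tokens.get(i) : "<end of stream>";
    }

    public static void main(String[] args) {
        // What the production is believed to consume (hypothetical names).
        List<String> expected = Arrays.asList(
            "login", "lbracket", "user", "rbracket", "semicolon");
        // What a lexer dump might actually show (hypothetical).
        List<String> actual = Arrays.asList(
            "login", "whitespace", "lbracket", "user", "rbracket", "semicolon");

        int i = firstMismatch(expected, actual);
        if (i < 0) {
            System.out.println("streams match");
        } else {
            System.out.println("first mismatch at token " + i + ": expected "
                + at(expected, i) + ", got " + at(actual, i));
        }
    }
}
```

The first mismatch is usually the place to start reading the grammar: either the lexer produced a token you did not plan for, or the production does not say what you meant.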

--- On Thu, 9/3/09, Timothy Washington <timothyjwashington@...> wrote:

> From: Timothy Washington <timothyjwashington@...>
> Subject: Re: Fixing Parser Weirdness
> To: "Discussion mailing list for the SableCC project" <sablecc-discussion@...>
> Date: Thursday, September 3, 2009, 11:49 PM
>
> That's actually a really good point - I made that
> change. But the parser is still breaking with the error in
> fig. 1 when
> I do a simple command ( see fig. 0 ). Ultimately I want to
> make a command like in fig. 2. To remove the error, all I
> have to
> do is remove the "returninput_opt" parts ( lines
> 98 & 288 ). This is what I meant by a sensitive parser,
> because I can't see
> how those 2 lines would break an unrelated token /
> production.
>
>
> login ( user -username root -password password );
> fig. 0
>
>
> login ( user -username root -password password );
> ERROR [Thread-3] (Bkell.java:186) - [4,7] expecting: lbracket
> com.interrupt.bookkeeping.cc.parser.ParserException: [4,7] expecting: lbracket
>    at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:1022)
>    at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:145)
>    at java.lang.Thread.run(Thread.java:637)
> ERROR [Thread-3] (Bkell.java:188) - <logs xmlns='com/interrupt/logs' id='' ><log xmlns='com/interrupt/logs' level='ERROR' id='' ><logMessages xmlns='com/interrupt/logs' id='' ><logMessage xmlns='com/interrupt/logs' id='' >[4,7] expecting: lbracket</logMessage>
> </logMessages>
> </log>
> </logs>
>
> ERROR [Thread-3] (Bkell.java:186) - [1,2] expecting: 'var', 'create', 'add', 'update', 'remove', 'reverse', 'find', 'list', 'print', 'commit', 'load', 'login', 'logout', 'exit'
> com.interrupt.bookkeeping.cc.parser.ParserException: [1,2] expecting: 'var', 'create', 'add', 'update', 'remove', 'reverse', 'find', 'list', 'print', 'commit', 'load', 'login', 'logout', 'exit'
>    at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:1022)
>    at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:145)
>    at java.lang.Thread.run(Thread.java:637)
> ERROR [Thread-3] (Bkell.java:188) - <logs xmlns='com/interrupt/logs' id='' ><log xmlns='com/interrupt/logs' level='ERROR' id='' ><logMessages xmlns='com/interrupt/logs' id='' ><logMessage xmlns='com/interrupt/logs' id='' >[1,2] expecting: 'var', 'create', 'add', 'update', 'remove', 'reverse', 'find', 'list', 'print', 'commit', 'load', 'login', 'logout', 'exit'</logMessage>
> </logMessages>
> </log>
> </logs>
> ...
>
> fig. 1
>
>
> var aauthUsers = add ( ( load ( `/system[ @id='main.system' ]` ) )
>            <user xmlns='com/interrupt/bookkeeping/users' id='seven' username='seven' password='seven' logintimeout='600000' >
>                <profileDetails xmlns='com/interrupt/bookkeeping/users' id='user.details' >
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='firstname' name='first.name' value='seven' />
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='lastname' name='last.name' value='seven' />
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='email' name='email' value='twashing@...' />
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='country' name='country' value='U.S.A' />
>                </profileDetails>
>            </user>
>            ,
>            -returninput true
>        );
>
> fig. 2
>
>
> Tim
>
>
>
> From: Christopher Van Kirk <chris.vankirk@...>
> To: Discussion mailing list for the SableCC project <sablecc-discussion@...>
> Sent: Thursday, September 3, 2009 12:08:21 AM
> Subject: RE: Fixing Parser Weirdness
>
> This is probably your problem:
>
> dash = '-';
> .
> .
> .
> entry_opt = '-entry' ws+ ( lowercase | uppercase | dash | colon_helper | ats | underscore | digit | dot )*;
>
> You’ve defined a dash token in the lexer section, but you’re expecting to see it joined to a word in the parser. Remember that what comes out of the Tokens section is a stream of tokens you have defined there, so what your parser will see is
>
> dash entry …, not -entry
>
> Try changing your production to something like:
>
> entry_opt = dash 'entry' ws+ ( lowercase | uppercase | dash | colon_helper | ats | underscore | digit | dot )*;
>
> Cheers,
>
> Chris…
>

>
> -----Original Message-----
> From: sablecc-discussion-bounces+chris.vankirk=fdcjapan.com@... [mailto:...=fdcjapan.com@...] On Behalf Of Timothy Washington
> Sent: Wednesday, September 02, 2009 11:43 PM
> To: sablecc-discussion@...
> Subject: Fixing Parser Weirdness
>
> Hey all, I've implemented a language using sablecc that looks like fig. 1.
>
> command(
>    ( <context/> )
>        <input/>,
>        `/input[ @att='value' ]`,
>        -inputoption value
> );
>
> fig. 1
>
>
>
>
>
>
>
> A) I'm trying to implement an input option, " -inputoption value ", but if I try to implement the command in fig. 2, I get the error seen in fig. 4.
>
> var removedE = remove ( ( load( `/system[ @id="main.system" ]/groups[ @id="main.groups" ]/group[ @id="seven.group" ]/bookkeeping[ @id="main.bookkeeping" ]/journals[ @id="main.journals" ]/journal[ @id="generalledger" ]/entries[ @id="main.entries" ]` ) )
>        <entry xmlns='com/interrupt/bookkeeping/journal' id='qwer' entrynum='' state='' journalid='' date='02022006' currency='CDN' >
>            <debit xmlns='com/interrupt/bookkeeping/account' id='asdf' amount='11.00' entryid='' accountid='1' account='office equipment' currency='CDN' />
>            <debit xmlns='com/interrupt/bookkeeping/account' id='zxcv' amount='1.50' entryid='' accountid='2' account='tax' currency='CDN' />
>            <credit xmlns='com/interrupt/bookkeeping/account' id='tyui' amount='12.50' entryid='' accountid='3' account='bank' currency='CDN' />
>        </entry>
>        ,
>        -returninput true
>    );
>
> fig. 2
>
>
>
>
>
>
>
> B) I'm also trying to get SableCC to ignore special characters between single and double quotes ( ' " ). So for example, I want to be able to say <profileDetail value='twashing@...' />, but the parser dies with the error in fig. 5.
>
> var aauthUsers = add ( ( load ( `/system[ @id='main.system' ]` ) )
>            <user xmlns='com/interrupt/bookkeeping/users' id='seven' username='seven' password='seven' logintimeout='600000' >
>                <profileDetails xmlns='com/interrupt/bookkeeping/users' id='user.details' >
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='firstname' name='first.name' value='seven' />
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='lastname' name='last.name' value='seven' />
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='email' name='email' value='twashing@...' />
>                    <profileDetail xmlns='com/interrupt/bookkeeping/users' id='country' name='country' value='U.S.A' />
>                </profileDetails>
>            </user>
>        );
>
> fig. 3
>
>
>
>
>
> It feels like SableCC is very sensitive to the order of tokens. Can anyone help me isolate and fix this problem? I've attached my sablecc file.
>
> Thanks in advance
>
> Tim
>
>
>
>
>
>
>
> com.interrupt.bookkeeping.cc.parser.ParserException: [1,1] expecting: 'var', 'create', 'add', 'update', 'remove', 'reverse', 'find', 'list', 'print', 'commit', 'load', 'login', 'logout', 'exit'
>    at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:1028)
>    at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:145)
>    at java.lang.Thread.run(Thread.java:637)
>
> fig. 4
>
> com.interrupt.bookkeeping.cc.parser.ParserException: [19,3] expecting: 'create', 'add', 'update', 'remove', 'reverse', 'find', 'list', 'print', 'commit', 'load', 'login', 'logout', 'exit', 'system', 'debit', 'credit', 'entry', 'entries', 'journal', 'journals', 'transaction', 'account', 'accounts', 'user', 'users', 'group', 'groups', 'allowedActions', 'command', 'profileDetails', 'profileDetail', 'userSession', '<', rbracket, atsign, '`'
>    at com.interrupt.bookkeeping.cc.parser.Parser.parse(Parser.java:1022)
>    at com.interrupt.bookkeeping.cc.bkell.Bkell.run(Bkell.java:145)
>    at java.lang.Thread.run(Thread.java:637)
>
> fig. 5
>
> _______________________________________________
> SableCC-Discussion mailing list
> SableCC-Discussion@...
> http://lists.sablecc.org/listinfo/sablecc-discussion
>


bkeeping.cc (24K) Download Attachment
Parser.java (238K) Download Attachment