error: alternative doesn't exist in section AST (even though it looks like it does)

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 | Next >

error: alternative doesn't exist in section AST (even though it looks like it does)

by Ben Evans-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

I have a grammar, with a couple of productions like this:

    goal {-> program} =
        kw_package barename tok_semi declaration_list
            {-> New program(barename, declaration_list.declarations)}
      ;

    declaration_list {-> declarations} =
        declaration* {-> New declarations([declaration])}
    ;

    // A top-level definition is either a statement in package context, a use statement, or a subroutine declaration (ie a method definition in Java)
    declaration {-> declaration}  =
        {use} kw_use fq_java_classname tok_semi
            {-> New declaration.use_decl(fq_java_classname)} |
        {sub} kw_sub barename block
            {-> New declaration.function(barename, New statement.block([block.statement]))} |
// FIXME - add statements in package context
//        {global} global_declaration
    ;

...

Abstract Syntax Tree

    program =
        barename declarations;

    declarations =
        declaration*;

    declaration =
        {inherit} constant |    /* multi_string becomes string. */
        {use_decl} fq_java_classname |
        {variable} declarator |
    // FIXME Sub prototype?
    // [parameters]:declaration*
        {function} barename statement
    ;
   
    declarator =
        {plain} identifier |
        {init} identifier expression
    ;

...

and am getting this error:

java.lang.RuntimeException: [0,0] alternative ADeclaration doesn't exist in section AST

Can someone help me untangle what I'm doing wrong here?

Thanks,

Ben

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Parent Message unknown Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Christopher Van Kirk :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Seems like you're attempting a CST to AST transformation to declaration with no subtype info specified like:

Concrete section

concrete {-> abstract } =
     token {-> abstract( token ) } // wrong
||   token {-> abstract.alt1( token ) } // right
;


Abstract section

abstract =
     { alt1 } token
||   { alt2 } token token;

--- On Wed, 5/20/09, Ben Evans <benjamin.john.evans@...> wrote:

> From: Ben Evans <benjamin.john.evans@...>
> Subject: error: alternative doesn't exist in section AST (even though it looks like it does)
> To: sablecc-discussion@...
> Date: Wednesday, May 20, 2009, 1:11 AM
> Hi,
>
> I have a grammar, with a couple of productions like this:
>
>     goal {-> program} =
>         kw_package barename tok_semi
> declaration_list
>             {-> New program(barename,
> declaration_list.declarations)}
>
>       ;
>
>     declaration_list {-> declarations} =
>         declaration* {-> New
> declarations([declaration])}
>     ;
>
>     // A top-level definition is either a statement in
> package context, a use statement, or a subroutine
> declaration (ie a method definition in Java)
>
>     declaration {-> declaration}  =
>         {use} kw_use fq_java_classname tok_semi
>             {-> New
> declaration.use_decl(fq_java_classname)} |
>         {sub} kw_sub barename block
>             {-> New
> declaration.function(barename, New
> statement.block([block.statement]))} |
>
> // FIXME - add statements in package context
> //        {global} global_declaration
>     ;
>
> ...
>
> Abstract Syntax Tree
>
>
>     program =
>         barename declarations;
>
>     declarations =
>         declaration*;
>
>     declaration =
>         {inherit} constant |    /* multi_string
> becomes string. */
>         {use_decl} fq_java_classname |
>
>         {variable} declarator |
>     // FIXME Sub prototype?
>     // [parameters]:declaration*
>         {function} barename statement
>     ;
>    
>     declarator =
>         {plain} identifier |
>         {init} identifier expression
>
>     ;
>
> ...
>
> and am getting this error:
>
> java.lang.RuntimeException: [0,0] alternative ADeclaration
> doesn't exist in section AST
>
> Can someone help me untangle what I'm doing wrong
> here?
>
>
> Thanks,
>
> Ben
>
>
> -----Inline Attachment Follows-----
>
> _______________________________________________
> SableCC-Discussion mailing list
> SableCC-Discussion@...
> http://lists.sablecc.org/listinfo/sablecc-discussion
>

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Etienne M. Gagnon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Ben,

I agree with Chris.

Another possibility is that you forgot to name and transform some CST
alternative:

declaration
 =
  {use} ...
   {-> ...}
 |
  ...
 |
  alternative name forgotten with no transformation // unnamed and not transformed
 ;
...
Abstract Syntax Tree
...


Etienne

Christopher Van Kirk wrote:
> Seems like you're attempting a CST to AST transformation to declaration with no subtype info specified like:
> ...
>  
--
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org




_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

signature.asc (265 bytes) Download Attachment

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Ben Evans-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Etienne,

It turns out the problem was (I think) a special case of what you were referring to here - a CST alternative like:

        {sub} kw_sub barename block
            {-> New declaration.function(barename, New statement.block([block.statement]))} |
//
//    stuff commented out for now
//
         ;

the problem seems to be the trailing | followed by comment followed by ;

I'm making progress again now.

Thanks,

Ben

On Wed, May 20, 2009 at 12:01 AM, Etienne M. Gagnon <egagnon@...> wrote:
Hi Ben,

I agree with Chris.

Another possibility is that you forgot to name and transform some CST
alternative:

declaration
 =
 {use} ...
  {-> ...}
 |
 ...
 |
 alternative name forgotten with no transformation // unnamed and not transformed
 ;
...
Abstract Syntax Tree
...


Etienne

Christopher Van Kirk wrote:
> Seems like you're attempting a CST to AST transformation to declaration with no subtype info specified like:
> ...
>
--
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org



_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion



_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Niklas Matthies :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Speaking of which, it would be nice for the SableCC 4 grammar to be
"rearrange friendly". Currently all alternatives are followed by '|',
except the last one which is followed by ';'. This means that when
adding/moving/removing alternatives, one has to take care to adjust
this ending token in addition to just copying/cutting/pasting lines.

In general it is a good language design guideline for list separators
to also be allowed to occur at the end of the list, so that all list
elements can be written uniformly. Java for example has done this for
the enum class syntax, where it's legal to write:

    enum MyEnum
    {
        FOO,
        BAR,
        BAZ,    // <- comma is allowed after last constant
        ;
        ...
    }

In SableCC there's a potential collision with empty alternatives of
course. But a viable solution could be to disallow unnamed empty
alternatives -- which are probably likely to be errors anyway.

-- Niklas Matthies

On Sun 2009-05-24 at 11:42h, Ben Evans wrote on sablecc-discussion:

> Hi Etienne,
>
> It turns out the problem was (I think) a special case of what you were
> referring to here - a CST alternative like:
>
>         {sub} kw_sub barename block
>             {-> New declaration.function(barename, New
> statement.block([block.statement]))} |
> //
> //    stuff commented out for now
> //
>          ;
>
> the problem seems to be the trailing | followed by comment followed by ;
>
> I'm making progress again now.
>
> Thanks,
>
> Ben
>
> On Wed, May 20, 2009 at 12:01 AM, Etienne M. Gagnon <egagnon@...>wrote:
>
> > Hi Ben,
> >
> > I agree with Chris.
> >
> > Another possibility is that you forgot to name and transform some CST
> > alternative:
> >
> > declaration
> >  =
> >  {use} ...
> >   {-> ...}
> >  |
> >  ...
> >  |
> >  alternative name forgotten with no transformation // unnamed and not
> > transformed
> >  ;
> > ...
> > Abstract Syntax Tree
> > ...
> >
> >
> > Etienne
> >
> > Christopher Van Kirk wrote:
> > > Seems like you're attempting a CST to AST transformation to declaration
> > with no subtype info specified like:
> > > ...
> > >
> > --
> > Etienne M. Gagnon, Ph.D.
> > SableCC:                                            http://sablecc.org
> >
> >
> >
> > _______________________________________________
> > SableCC-Discussion mailing list
> > SableCC-Discussion@...
> > http://lists.sablecc.org/listinfo/sablecc-discussion
> >
> >

> _______________________________________________
> SableCC-Discussion mailing list
> SableCC-Discussion@...
> http://lists.sablecc.org/listinfo/sablecc-discussion



_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Etienne M. Gagnon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Niklas,

See below.

In SableCC there's a potential collision with empty alternatives of
course. But a viable solution could be to disallow unnamed empty
alternatives -- which are probably likely to be errors anyway.
  

That's an interesting proposal, but I don't think that it could work within SableCC 4. Let me explain.

SableCC 4 allows for multiple unnamed alternatives in a single production. This is in line with SableCC 4's new philosophy of: you only need to name it if you need to refer to it and its default name is ambiguous (or missing).

Example:
some_prod = some_prod child | ;  // notice the empty second unnamed alternative
child = {minor} ... | {adult} ...;
In SableCC 3, this would raise an error, as two alternatives of some_prod are unnamed.

SableCC 4 allows for it. Now, once the AST is built, it won't have explicit types for both alternatives. In other words, you won't be able to access some_prod alternatives using an in/out/caseASomeProd() method. But, and that's the idea, the AST will contain the nodes, so you'll be able to use in/out/caseAMinorChild() and, therefore, have access to all "child" nodes of the AST.

So many people complained about how SableCC was painful to use because it forced you to name everything. This prompted me to relax SableCC 4's naming requirements while preserving good behavior: instead of allowing for index-based access, SableCC keeps named-based access but denies direct access when no unambiguous name is provided.

Of course, this philosophy gets in direct conflict with your otherwise quite interesting proposal, as illustrated by the example above.

I think that keeping relaxed naming will be of greater benefit to SableCC 4's users. What do you think?

Etienne
-- 
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org


_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

signature.asc (265 bytes) Download Attachment

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Niklas Matthies :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun 2009-05-24 at 20:37h, Etienne M. Gagnon wrote on sablecc-discussion:
:
> SableCC 4 allows for multiple unnamed alternatives in a single
> production. This is in line with SableCC 4's new philosophy of: /you
> only need to name it if you need to refer to it and its default name is
> ambiguous (or missing)/.

I see, that's certainly desirable of course.

:
> I think that keeping relaxed naming will be of greater benefit to
> SableCC 4's users. What do you think?

I think that it may still be okay to require that empty alternatives
have to be made more explicit in some way. This would also benefit
readers of the grammar. Stuff like "= |" or "| |" or "| ;" or "= ;"
isn't exactly readable, it's easy to overlook or misread. Maybe
provide some sort of epsilon symbol?

-- Niklas Matthies

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Etienne M. Gagnon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Niklas,

See below.
> I think that it may still be okay to require that empty alternatives
> have to be made more explicit in some way. This would also benefit
> readers of the grammar. Stuff like "= |" or "| |" or "| ;" or "= ;"
> isn't exactly readable, it's easy to overlook or misread. Maybe
> provide some sort of epsilon symbol?
>  
If it can make you happy, in the lexer, this is the case in SableCC 4;
you have to write '' to get an empty string.

The problem with parsers is that empty alternatives are a standard idiom
used in almost every compiler book out there (for implementing lists
using recursion).

Of course, SableCC's best practices are to use multiplicity operators
(e.g. *, ?) instead of empty alternatives, but SableCC has to remain
appealing to new users that haven't yet learned about such practices.
Also, it is convenient to be able to easily rewrite an existing grammar
that doesn't use operators in SableCC's syntax. SableCC 4's name
requirements relaxation help much in that direction.

Your main incentive for disallowing empty alternatives seems to be
related to errors. SableCC 4 is not SableCC 3. It won't report errors at
"line 0 position 0"! A spurious empty alternative in the CST that has no
match in the AST will get a precise error message. As for conflicts
resulting from empty alternatives, they will clearly identify the
problem : the conflict has to involve a reduction of the empty
alternative (it can't be a shift)! Two empty alternatives in the
concrete grammar automatically cause a conflict. So, there you go:
errors related to empty alternatives will be easy to locate.

If empty alternatives were not as widely used, I would agree with you to
force users into using some keyword to identify them, but given the
current state of things, I think that it is best to allow them without
requiring any special syntax.

Etienne

--
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org




_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

signature.asc (265 bytes) Download Attachment

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Pete Poulos :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, May 25, 2009 at 8:57 AM, Etienne M. Gagnon <egagnon@...> wrote:
> If empty alternatives were not as widely used, I would agree with you to
> force users into using some keyword to identify them, but given the
> current state of things, I think that it is best to allow them without
> requiring any special syntax.

What if you had a flag or option the user could turn on disallow empty
expansions for those who are interested in it?

Also, while I haven't used SableCC in a while now (I'm *eagerly*
waiting till sableCC4 comes out), whenever I create productions I tend
to put the expression separators at the beginning of the next line,
and I find it easier to make sure they are there since they line up
vertically; so I would have written the erroneous block above like so:

// A top-level definition is either a statement in package context, a
use statement, or a subroutine declaration (ie a method definition in
Java)
declaration {-> declaration}  =
      {use} kw_use fq_java_classname tok_semi  {-> New
declaration.use_decl(fq_java_classname)}
    |  {sub} kw_sub barename block  {-> New
declaration.function(barename, New
statement.block([block.statement]))}
//  |      {global} global_declaration
    ;


A few things about this, if you comment the last line out, it's still
safe.  If the beginning of the line is missing the "|" like the first
line, it's easy to spot.  Also, I find that it's easier to remember
not to put the "|" in front of the first line in this scenario than it
is to remember to remove it from the last line when the bars are at
the end of the line.

Cheers,
Pete Poulos

PS: can we get a rough estimate on when we can expect to see sableCC 4?

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Pete Poulos :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> // A top-level definition is either a statement in package context, a
> use statement, or a subroutine declaration (ie a method definition in
> Java)
> declaration {-> declaration}  =
>      {use} kw_use fq_java_classname tok_semi  {-> New
> declaration.use_decl(fq_java_classname)}
>    |  {sub} kw_sub barename block  {-> New
> declaration.function(barename, New
> statement.block([block.statement]))}
> //  |      {global} global_declaration
>    ;
>

err, with the way my email browser reformats my message (not sure how
it looks on your end) this looks like crap.  Let me create a shorter
example:

expr =
     {add} expr + expr
   | {sub} expr - expr
   | {mul} expr * expr
   | {div} expr / expr
   | {mod} expr % expr
   ;

cheers,
Pete Poulos

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Etienne M. Gagnon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Pete,

See Below.

What if you had a flag or option the user could turn on disallow empty
expansions for those who are interested in it?
  

That won't answer the original request to allow for trailing "|".

declaration {-> declaration}  =
      {use} kw_use fq_java_classname tok_semi  {-> New
declaration.use_decl(fq_java_classname)}
    |  {sub} kw_sub barename block  {-> New
declaration.function(barename, New
statement.block([block.statement]))}
//  |      {global} global_declaration
    ;
  
Why not put the '=' on the next line, e.g.
some_production
  = x y z
  | a b c
  | d e f
  ;
But, again, this doesn't solve the problem of getting identical layout for all alternatives.

The only real solution is to use a grammar formatter. Such a thing could even add some standard comment in the body of an empty alternative to help users see them. A formatter could be written as a separate software, or even be better, written as an Eclipse plugin... I'll let you guys work on that. :)

PS: can we get a rough estimate on when we can expect to see sableCC 4?
  

Ah! The dreaded question... I am aiming for the beginning of Summer.

Etienne

-- 
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org


_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

signature.asc (265 bytes) Download Attachment

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Niklas Matthies :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon 2009-05-25 at 10:57h, Etienne M. Gagnon wrote on sablecc-discussion:
:
> Your main incentive for disallowing empty alternatives seems to be
> related to errors.

No, my main incentive is to have rearrange-friendly syntax, because
I've grown tired of replacing '|'s with ';'s and vice versa all the
time when designing and testing grammars. Also, just to be clear, I
don't want to disallow empty alternatives (= alternatives representing
the empty token sequence), but I'd want to disallow representing them
with an empty SableCC expression. It seems to me that empty
alternatives that have neither names nor AST transformation
annotations are not common enough in practice to justify impeding a
rearrange-friendly syntax for their sake. Preventing unintended or
reader-unfriendly empty alternatives would just be a side benefit.

In books and papers usually an epsilon is used to denote an empty
alternative in production rules. In informal textual communication,
one of "epsilon", "\epsilon" or "e" is often used. To be honest, I
don't remember ever having seen a production rule being defined as

    A -> aBA |

in print rather than by

    A -> aBA | \epsilon

There also have been parser generators that provide and/or require an
epsilon constant in productions. Therefore I think that it isn't
entirely unreasonable, or would do much harm, to disallow representing
empty alternatives with an entirely empty expression in SableCC.

A different way to achieve rearrange-friendly syntax would be to have
a Not-an-Alternative symbol, say $$, so one could write:

    stuff =
        foo |
        bar |
        baz |
        $$;

...but that'd be quite a hack.

-- Niklas Matthies

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Niklas Matthies :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon 2009-05-25 at 12:23h, Etienne M. Gagnon wrote on sablecc-discussion:
:
> Why not put the '=' on the next line, e.g.
>
> some_production
>   = x y z
>   | a b c
>   | d e f
>   ;

Personally I use:

    some_production       {-> AST stuff}  =
        {name1}   x y z   {-> AST stuff}  |
        {name2}   a b c   {-> AST stuff}  |
        {name3}   d e f   {-> AST stuff}  ;

There's already so much going on, better put the uninteresting
punctuation at the end. ;)

-- Niklas Matthies

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Etienne M. Gagnon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


In books and papers usually an epsilon is used to denote an empty
alternative in production rules. In informal textual communication,
one of "epsilon", "\epsilon" or "e" is often used. To be honest, I
don't remember ever having seen a production rule being defined as

    A -> aBA |

in print rather than by

    A -> aBA | \epsilon
  
Right. But imagine the following situation. A new user comes in and writes:
  opt_ident =
    ident |
    /* empty */ ;

Of course, as usual, SableCC would ignore the comment. Now, with your proposal, SableCC would ignore the trailing "|" and thus treat this production as a single alternative one.

Worse: Maybe both the CST and AST use the same layout. So, it will get past through all checks. It's only later, at usage time, that end-users of the parser will notice that the ident is not optional.

It is easy to add an Empty keyword. It's more difficult to keep users from harming themselves by forgetting to write it explicitly.

Etienne

-- 
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org


_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

signature.asc (265 bytes) Download Attachment

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Pete Poulos :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Personally I use:
>
>    some_production       {-> AST stuff}  =
>        {name1}   x y z   {-> AST stuff}  |
>        {name2}   a b c   {-> AST stuff}  |
>        {name3}   d e f   {-> AST stuff}  ;

I was just sharing my approach because I feel that it is easier to
rearrange the statements without causing the problems outlined in this
thread.  I also think that, once you're used to this approach, it's
easy to ignore the '|' at the start since it is essentially a straight
line running down the left side of the text (unless there's a problem,
in which case the error stands out as a break in that line and becomes
noticeable). I'm sure there are valid arguments for either approach,
just as everyone has thier preference for where to place curly braces
{ } in programming code.

Cheers,
Pete

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Niklas Matthies :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon 2009-05-25 at 13:21h, Etienne M. Gagnon wrote on sablecc-discussion:

>
> > In books and papers usually an epsilon is used to denote an empty
> > alternative in production rules. In informal textual communication,
> > one of "epsilon", "\epsilon" or "e" is often used. To be honest, I
> > don't remember ever having seen a production rule being defined as
> >
> >     A -> aBA |
> >
> > in print rather than by
> >
> >     A -> aBA | \epsilon
>
> Right. But imagine the following situation. A new user comes in and writes:
>
>   opt_ident =
>     ident |
>     /* empty */ ;

Of course the comment here is a symptom of the fact that
empty-expression alternatives are non-intuitive: you have to reassure
the reader that you really mean "empty token sequence" here. This
legacy is unfortunate. But I'm just ranting. :)

> Of course, as usual, SableCC would ignore the comment. Now, with your
> proposal, SableCC would ignore the trailing "|" and thus treat this
> production as a single alternative one.
>
> Worse: Maybe both the CST and AST use the same layout. So, it will get
> past through all checks. It's only later, at usage time, that end-users
> of the parser will notice that the ident is not optional.

I'm semi-convinced. Of course silent errors can happen all the time,
like an accidental '||' instead of '|', or a forgotten '?', or indeed
writing '(a|b|c|)' instead of '(a|b|c)' in a regex (which is something
that does happen[1] -- quite an amusing read). But the above could
really be hard to spot for someone who's been accustomed to that
pattern. I don't know.

-- Niklas Matthies

[1] http://disordered.org/Content.cgi/dumbth/0569.html


_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Etienne M. Gagnon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Niklas,

See below.
> I'm semi-convinced. Of course silent errors can happen all the time,
> like an accidental '||' instead of '|', or a forgotten '?', or indeed
> writing '(a|b|c|)' instead of '(a|b|c)' in a regex (which is something
> that does happen[1] -- quite an amusing read). But the above could
> really be hard to spot for someone who's been accustomed to that
> pattern. I don't know.
>  
There is a huge difference in the impact of an erroneous trailing "|"
left by error in the parser (leading to a spurious empty alternative),
and the same thing in a lexer.

For one thing, the parser needs an unambiguous parse. This restricts the
use of empty alternatives. The good thing is that empty alternatives are
quite good at generating conflicts. Second, as SableCC constructs an
AST, there will be a chance to notice the spurious construct in the
generated AST & tree walker classes (e.g. visitors).

As for lexers, they don't care about ambiguity; they only care about
matching. This means that once something is matched, no knowledge is
kept as how it actually matched. e.g.

word = letter* | 'thisisaword' | 'thisisanotherword';

The "word" token will match the string "thisisaword". Now, whether it
matches it as a "letter*" or as "'thisisaword'" is not known. But,
SableCC 4 does protect users from spurious "|" in lexers:

some_token = 'a' | 'b' |;  // error!

Instead, you have to write:

some_token = 'a' | 'b' | '';

This is quite acceptable, as a lexer is string based. So, it makes sense
to represent epsilon as an empty string.

Now, unfortunately, we can't extend this to the parser grammar:

Parser
 some_prod =
   '' |  // alternative with one token element
   ;     // empty alternative

As SableCC 4 allows for literal tokens in the parser grammar, the above
two alternatives are really distinct.

If there was an intuitive solution that wouldn't result in "surprises"
for new users that DON'T READ THE ******* DOCUMENTATION, as usual
(!!!!), then I would accept it. But, I haven't found one, and your
proposal would be surprising to new users. Really. I'm all in favor of
"no surprises". I leave other parser generators to surprise grammar
writers or (more likely) parser end-users with hidden side effects on
conflict resolution and other "features".

But, thanks for the neat suggestion. :)

Etienne

--
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org




_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

signature.asc (265 bytes) Download Attachment

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Etienne M. Gagnon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Actually, a solution would be to use an "alternative terminator" instead of a separator. Something like:
some_prod =
  {alt} some alt elements ;  // ";" being an alternative terminator
  {other} other elements ;
  ; // empty unnamed alternative
  a b c ; // unnamed alternative
  ;; // ";;" being the production terminator

With such a proposal, we get identical layout for all alternatives, but I simply don't like it. As for considering "|" as a terminator, I won't buy it. We have to deal with the legacy of hordes of programmers knowing "|" as a binary operator (e.g. separator).

OK, let's kill this thread, unless somebody has a really distinct proposal that is likely to be accepted.

Etienne


-- 
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org


_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

signature.asc (265 bytes) Download Attachment

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Niklas Matthies :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon 2009-05-25 at 16:31h, Etienne M. Gagnon wrote on sablecc-discussion:
> Actually, a solution would be to use an "alternative terminator" instead
> of a separator. Something like:
>
> some_prod =
>   {alt} some alt elements ;  // ";" being an alternative terminator
>   {other} other elements ;
>   ; // empty unnamed alternative
>   a b c ; // unnamed alternative
>   ;; // ";;" being the production terminator

Yeah, I was actually thinking about having a syntax like

    some_prod   // no '='
        -> {foo} ...
        -> {bar} ...
        -> {baz} ...
        ;   // end of production`

or

    some_prod   // no '='
        {foo} -> ...
        {bar} -> ...
        {baz} -> ...
        ;   // end of production`

loosely based on the usual production rule syntax from academics,
and also a little bit inspired by the syntax of functions on algebraic
types from pattern-matching strongly typed functional languages. But
it's probably too much a departure from BNF, and providing it as an
additional alternative syntax would be bad.

> With such a proposal, we get identical layout for all alternatives, but
> I simply don't like it. As for considering "|" as a terminator, I won't
> buy it. We have to deal with the legacy of hordes of programmers knowing
> "|" as a binary operator (e.g. separator).

Yes, it works particularly badly for '|'.

Another thing I try to have in languages is to have a neutral element
for everything. (For example it's inconvenient how there's no "empty
set" syntax and no true/false constants in SQL.) So it would be nice
to have a neutral element N for productions, such that "N | x" is
equivalent to "x" for all x. (Effectively this is the same as the "$$"
sugested earlier.) But I guess that only automated grammar generators
and people like me would really make use of this. :)

-- Niklas Matthies

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: error: alternative doesn't exist in section AST (even though it looks like it does)

by Niklas Matthies :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

[This message seems to have gotten stuck in my outbox yesterday.]

On Mon 2009-05-25 at 11:23h, Pete Poulos wrote on sablecc-discussion:
:
> I was just sharing my approach because I feel that it is easier to
> rearrange the statements without causing the problems outlined in this
> thread.  I also think that, once you're used to this approach, it's
> easy to ignore the '|' at the start since it is essentially a straight
> line running down the left side of the text (unless there's a problem,
> in which case the error stands out as a break in that line and becomes
> noticeable).

Yes, thanks for that suggestion, I may give this a try (again) sometime.
The issues I had with this approach when I considered it earlier were:

- If you want just one space character after the '|', then you can't
  just use normal Tab indentation for both the first alternative
  (which doesn't have a '|') and the others.

- When you press Return after the first alternative, "smart indent"
  behavior places the cursor too far to the right to immediately start
  typing the second alternative. You have to backspace for the '| '.
  And then "smart backspace" actually un-tabs, which may go too far to
  the left again (unless you align the '|'s with the production name).

- Productions with only a few alternatives are quite common, at least
  in my grammars, so the ratio of first alternatives is significant.

- The fact that an "other" alternative is usually placed at the end
  ("other" as in "mul_expr = {mul} ... | {div} ... | {other} ...")
  somewhat works against making the first alternative the special case
  as opposed to the last alternative.

- The '|'s are just added visual noise a the start of lines. For the
  human reader, indentation (or just the name annotations) is usually
  sufficient.

- If you align the production's AST annotation with the alternatives'
  AST annotations, then you get a "spurious" '=' behind the former.
  Of course one could wrap it to the line of the first alternative,
  aligning it with the '|'s. But associating the '=' with the first
  alternative also looks slightly wrong.

- As the ';' now gets its own line, you end up with one more line per
  production (less productions fit on the screen), and almost double
  line spacing between productions.

Given these, the other approach appeared more attractive. :)

-- Niklas Matthies

_______________________________________________
SableCC-Discussion mailing list
SableCC-Discussion@...
http://lists.sablecc.org/listinfo/sablecc-discussion
< Prev | 1 - 2 | Next >