Fix %error-verbose for conflicts resolved by %nonassoc.

View: New views
16 Messages — Rating Filter:   Alert me  

Fix %error-verbose for conflicts resolved by %nonassoc.

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Verbose syntax error messages used to include error action tokens in the
expected token list.  There was a FIXME in conflicts.at about this.  To
fix it, I pushed the following patches to master and similar ones to
branch-2.5.  I feel this area is a bit too subtle to include this change
in branch-2.4.2.

I'm not proud of propagating more preprocessor macros to yacc.c.  Maybe
this is a good place for some more functions like the ones Akim has
discussed recently.  Also, we could probably find better names than NINF.  
For example, error_action or default_action.

Anyway, for now, I mainly wanted to fix a bug.

>From 87412882128fc3ae807f47db23884552f5841e74 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <jdenny@...>
Date: Tue, 25 Aug 2009 01:12:37 -0400
Subject: [PATCH] Some code and documentation improvements.

* data/c.m4 (b4_table_value_equals): New macro to capture
some repeated code.
* data/glr.c (yyis_pact_ninf): Use it here.
(yyis_table_ninf): Likewise.
(yyreportSyntaxError): Improve internal comments.
* data/yacc.c (yyis_pact_ninf): New macro copied from glr.c.
Use it everywhere possible.
(yyis_table_ninf): Likewise.
(yysyntax_error): Improve internal comments.
* data/lalr1.cc (yysyntax_error_): Likewise.
* data/lalr1.java (yysyntax_error): Likewise.
* src/tables.h: Improve comments about yypact, yytable, etc.
---
 ChangeLog       |   16 ++++++++++++++++
 data/c.m4       |   10 ++++++++++
 data/glr.c      |   11 ++++-------
 data/lalr1.cc   |    3 ++-
 data/lalr1.java |    3 ++-
 data/yacc.c     |   15 +++++++++++----
 src/tables.h    |   13 +++++++++++--
 7 files changed, 56 insertions(+), 15 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b831d11..dc11d51 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,19 @@
+2009-08-25  Joel E. Denny  <jdenny@...>
+
+ Some code and documentation improvements.
+ * data/c.m4 (b4_table_value_equals): New macro to capture
+ some repeated code.
+ * data/glr.c (yyis_pact_ninf): Use it here.
+ (yyis_table_ninf): Likewise.
+ (yyreportSyntaxError): Improve internal comments.
+ * data/yacc.c (yyis_pact_ninf): New macro copied from glr.c.
+ Use it everywhere possible.
+ (yyis_table_ninf): Likewise.
+ (yysyntax_error): Improve internal comments.
+ * data/lalr1.cc (yysyntax_error_): Likewise.
+ * data/lalr1.java (yysyntax_error): Likewise.
+ * src/tables.h: Improve comments about yypact, yytable, etc.
+
 2009-08-21  Joel E. Denny  <jdenny@...>
 
  Use locale when quoting.
diff --git a/data/c.m4 b/data/c.m4
index 33a4186..3ba48db 100644
--- a/data/c.m4
+++ b/data/c.m4
@@ -157,6 +157,16 @@ m4_define([b4_int_type_for],
 [b4_int_type($1_min, $1_max)])
 
 
+# b4_table_value_equals(TABLE, VALUE, LITERAL)
+# --------------------------------------------
+# Without inducing a comparison warning from the compiler, check if the
+# literal value LITERAL equals VALUE from table TABLE, which must have
+# TABLE_min and TABLE_max defined.
+m4_define([b4_table_value_equals],
+[m4_if(m4_eval($3 < m4_indir([b4_]$1[_min])
+               || m4_indir([b4_]$1[_max]) < $3), [1],
+       [[YYID (0)]],
+       [[((]$2[) == (]$3[))]])])
 
 ## ---------##
 ## Values.  ##
diff --git a/data/glr.c b/data/glr.c
index a4b921f..b0ad100 100644
--- a/data/glr.c
+++ b/data/glr.c
@@ -954,9 +954,7 @@ yylhsNonterm (yyRuleNum yyrule)
 }
 
 #define yyis_pact_ninf(yystate) \
-  ]m4_if(m4_eval(b4_pact_ninf < b4_pact_min), [1],
- [0],
- [((yystate) == YYPACT_NINF)])[
+  ]b4_table_value_equals([[pact]], [[yystate]], [b4_pact_ninf])[
 
 /** True iff LR state STATE has only a default reduction (regardless
  *  of token).  */
@@ -974,9 +972,7 @@ yydefaultAction (yyStateNum yystate)
 }
 
 #define yyis_table_ninf(yytable_value) \
-  ]m4_if(m4_eval(b4_table_ninf < b4_table_min), [1],
- [YYID (0)],
- [((yytable_value) == YYTABLE_NINF)])[
+  ]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[
 
 /** Set *YYACTION to the action to take in YYSTATE on seeing YYTOKEN.
  *  Result R means
@@ -2048,7 +2044,8 @@ yyreportSyntaxError (yyGLRStack* yystackp]b4_user_formals[)
   char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM];
 
   /* Start YYX at -YYN if negative to avoid negative indexes in
-     YYCHECK.  */
+     YYCHECK.  In other words, skip the first -YYN actions for this
+     state because they are default actions.  */
   int yyxbegin = yyn < 0 ? -yyn : 0;
 
   /* Stay within bounds of both yycheck and yytname.  */
diff --git a/data/lalr1.cc b/data/lalr1.cc
index 2a9316d..5ce0b47 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -954,7 +954,8 @@ b4_error_verbose_if([state_type yystate, int yytoken],
     if (yypact_ninf_ < yyn && yyn <= yylast_)
       {
  /* Start YYX at -YYN if negative to avoid negative indexes in
-   YYCHECK.  */
+   YYCHECK.  In other words, skip the first -YYN actions for this
+   state because they are default actions.  */
  int yyxbegin = yyn < 0 ? -yyn : 0;
 
  /* Stay within bounds of both yycheck and yytname.  */
diff --git a/data/lalr1.java b/data/lalr1.java
index e68650d..7e274d6 100644
--- a/data/lalr1.java
+++ b/data/lalr1.java
@@ -735,7 +735,8 @@ m4_popdef([b4_at_dollar])])dnl
             StringBuffer res;
 
             /* Start YYX at -YYN if negative to avoid negative indexes in
-               YYCHECK.  */
+               YYCHECK.  In other words, skip the first -YYN actions for this
+               state because they are default actions.  */
             int yyxbegin = yyn < 0 ? -yyn : 0;
 
             /* Stay within bounds of both yycheck and yytname.  */
diff --git a/data/yacc.c b/data/yacc.c
index fa14cc5..9689223 100644
--- a/data/yacc.c
+++ b/data/yacc.c
@@ -525,8 +525,14 @@ static const ]b4_int_type_for([b4_toknum])[ yytoknum[] =
 
 #define YYPACT_NINF ]b4_pact_ninf[
 
+#define yyis_pact_ninf(yystate) \
+  ]b4_table_value_equals([[pact]], [[yystate]], [b4_pact_ninf])[
+
 #define YYTABLE_NINF ]b4_table_ninf[
 
+#define yyis_table_ninf(yytable_value) \
+  ]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[
+
 ]b4_parser_tables_define[
 
 #define yyerrok (yyerrstatus = 0)
@@ -848,7 +854,8 @@ yysyntax_error (char *yyresult, int yystate, int yytoken)
       char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM];
 
       /* Start YYX at -YYN if negative to avoid negative indexes in
- YYCHECK.  */
+ YYCHECK.  In other words, skip the first -YYN actions for this
+ state because they are default actions.  */
       int yyxbegin = yyn < 0 ? -yyn : 0;
 
       /* Stay within bounds of both yycheck and yytname.  */
@@ -1271,7 +1278,7 @@ yybackup:
 
   /* First try to decide what to do without reference to lookahead token.  */
   yyn = yypact[yystate];
-  if (yyn == YYPACT_NINF)
+  if (yyis_pact_ninf (yyn))
     goto yydefault;
 
   /* Not known => get a lookahead token if don't already have one.  */
@@ -1321,7 +1328,7 @@ yyread_pushed_token:]])[
   yyn = yytable[yyn];
   if (yyn <= 0)
     {
-      if (yyn == 0 || yyn == YYTABLE_NINF)
+      if (yyn == 0 || yyis_table_ninf (yyn))
  goto yyerrlab;
       yyn = -yyn;
       goto yyreduce;
@@ -1505,7 +1512,7 @@ yyerrlab1:
   for (;;)
     {
       yyn = yypact[yystate];
-      if (yyn != YYPACT_NINF)
+      if (!yyis_pact_ninf (yyn))
  {
   yyn += YYTERROR;
   if (0 <= yyn && yyn <= YYLAST && yycheck[yyn] == YYTERROR)
diff --git a/src/tables.h b/src/tables.h
index a44cdb8..b21fa7b 100644
--- a/src/tables.h
+++ b/src/tables.h
@@ -54,8 +54,15 @@
    something else to do.
 
    YYPACT[S] = index in YYTABLE of the portion describing state S.
-   The lookahead token's type is used to index that portion to find
-   out what to do.
+   The lookahead token's number, I, is used to index that portion of
+   YYTABLE to find out what action to perform.
+
+   If YYPACT[S] == YYPACT_NINF, if YYPACT[S] + I is outside the bounds
+   of YYTABLE (from 0 to YYLAST), or if YYCHECK indicates that I is
+   outside the bounds of the portion for S, then the default action
+   (from YYDEFACT and YYDEFGOTO) should be used instead of YYTABLE.
+   Otherwise, the value YYTABLE[YYPACT[S] + I] should be used even if
+   YYPACT[S] < 0.
 
    If the value in YYTABLE is positive, we shift the token and go to
    that state.
@@ -64,6 +71,8 @@
 
    If the value is zero, the default action from YYDEFACT[S] is used.
 
+   If the value is YYTABLE_NINF, it's a syntax error.
+
    YYPGOTO[I] = the index in YYTABLE of the portion describing what to
    do after reducing a rule that derives variable I + NTOKENS.  This
    portion is indexed by the parser state number, S, as of before the
--
1.5.4.3


>From 53f036ce027289d3f5e70c88735b88aa6725381d Mon Sep 17 00:00:00 2001
From: Joel E. Denny <jdenny@...>
Date: Tue, 25 Aug 2009 01:13:02 -0400
Subject: [PATCH] Fix %error-verbose for conflicts resolved by %nonassoc.

* NEWS (2.5): Document.
* data/glr.c (yyreportSyntaxError): Fix this by checking
yyis_table_ninf.
* data/yacc.c (yysyntax_error): Likewise.
* data/lalr1.cc (yysyntax_error_): Fix this by checking
yytable_ninf_.
* data/lalr1.java (yysyntax_error): Likewise.
* tests/conflicts.at (%nonassoc and eof): Update expected output
and remove FIXME.
---
 ChangeLog          |   13 +
 NEWS               |    9 +
 data/glr.c         |    3 +-
 data/lalr1.cc      |    3 +-
 data/lalr1.java    |    6 +-
 data/yacc.c        |    3 +-
 src/parse-gram.c   |  652 ++++++++++++++++++++++++++--------------------------
 src/parse-gram.h   |    8 +-
 tests/conflicts.at |    9 +-
 9 files changed, 369 insertions(+), 337 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index dc11d51..6624a0c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,18 @@
 2009-08-25  Joel E. Denny  <jdenny@...>
 
+ Fix %error-verbose for conflicts resolved by %nonassoc.
+ * NEWS (2.5): Document.
+ * data/glr.c (yyreportSyntaxError): Fix this by checking
+ yyis_table_ninf.
+ * data/yacc.c (yysyntax_error): Likewise.
+ * data/lalr1.cc (yysyntax_error_): Fix this by checking
+ yytable_ninf_.
+ * data/lalr1.java (yysyntax_error): Likewise.
+ * tests/conflicts.at (%nonassoc and eof): Update expected output
+ and remove FIXME.
+
+2009-08-25  Joel E. Denny  <jdenny@...>
+
  Some code and documentation improvements.
  * data/c.m4 (b4_table_value_equals): New macro to capture
  some repeated code.
diff --git a/NEWS b/NEWS
index 54a10d8..7a5009c 100644
--- a/NEWS
+++ b/NEWS
@@ -146,6 +146,15 @@ Bison News
   Bison now warns when a character literal is not of length one.  In
   some future release, Bison will report an error instead.
 
+** Verbose error messages fixed for nonassociative tokens.
+
+  When %error-verbose is specified, syntax error messages produced by
+  the generated parser include the unexpected token as well as a list of
+  expected tokens.  Previously, this list erroneously included tokens
+  that would actually induce a syntax error because conflicts for them
+  were resolved with %nonassoc.  Such tokens are now properly omitted
+  from the list.
+
 * Changes in version 2.4.2 (????-??-??):
 
 ** Detection of GNU M4 1.4.6 or newer during configure is improved.
diff --git a/data/glr.c b/data/glr.c
index b0ad100..85a5aff 100644
--- a/data/glr.c
+++ b/data/glr.c
@@ -2060,7 +2060,8 @@ yyreportSyntaxError (yyGLRStack* yystackp]b4_user_formals[)
   yyarg[yycount++] = yytokenName (yytoken);
 
   for (yyx = yyxbegin; yyx < yyxend; ++yyx)
-    if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR)
+    if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR
+ && !yyis_table_ninf (yytable[yyx + yyn]))
       {
  if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM)
   {
diff --git a/data/lalr1.cc b/data/lalr1.cc
index 5ce0b47..1e4c856 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -971,7 +971,8 @@ b4_error_verbose_if([state_type yystate, int yytoken],
         char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM];
         yyarg[yycount++] = yytname_[yytoken];
  for (int yyx = yyxbegin; yyx < yyxend; ++yyx)
-  if (yycheck_[yyx + yyn] == yyx && yyx != yyterror_)
+  if (yycheck_[yyx + yyn] == yyx && yyx != yyterror_
+      && yytable_[yyx + yyn] != yytable_ninf_)
           {
             if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM)
             {
diff --git a/data/lalr1.java b/data/lalr1.java
index 7e274d6..646b777 100644
--- a/data/lalr1.java
+++ b/data/lalr1.java
@@ -744,7 +744,8 @@ m4_popdef([b4_at_dollar])])dnl
             int yyxend = yychecklim < yyntokens_ ? yychecklim : yyntokens_;
             int count = 0;
             for (int x = yyxbegin; x < yyxend; ++x)
-              if (yycheck_[x + yyn] == x && x != yyterror_)
+              if (yycheck_[x + yyn] == x && x != yyterror_
+                  && yycheck_[x + yyn] != yytable_ninf_)
                 ++count;
 
             // FIXME: This method of building the message is not compatible
@@ -755,7 +756,8 @@ m4_popdef([b4_at_dollar])])dnl
               {
                 count = 0;
                 for (int x = yyxbegin; x < yyxend; ++x)
-                  if (yycheck_[x + yyn] == x && x != yyterror_)
+                  if (yycheck_[x + yyn] == x && x != yyterror_
+                      && yycheck_[x + yyn] != yytable_ninf_)
                     {
                       res.append (count++ == 0 ? ", expecting " : " or ");
                       res.append (yytnamerr_ (yytname_[x]));
diff --git a/data/yacc.c b/data/yacc.c
index 9689223..acc1118 100644
--- a/data/yacc.c
+++ b/data/yacc.c
@@ -869,7 +869,8 @@ yysyntax_error (char *yyresult, int yystate, int yytoken)
       yyarg[yycount++] = yytname[yytoken];
 
       for (yyx = yyxbegin; yyx < yyxend; ++yyx)
- if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR)
+ if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR
+    && !yyis_table_ninf (yytable[yyx + yyn]))
   {
     if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM)
       {
diff --git a/tests/conflicts.at b/tests/conflicts.at
index 98b7d41..f2f7861 100644
--- a/tests/conflicts.at
+++ b/tests/conflicts.at
@@ -99,20 +99,17 @@ AT_BISON_CHECK([-o input.c input.y])
 AT_COMPILE([input])
 
 AT_PARSER_CHECK([./input '0<0'])
-# FIXME: This is an actual bug, but a new one, in the sense that
-# no one has ever spotted it!  The messages are *wrong*: there should
-# be nothing there, it should be expected eof.
 AT_PARSER_CHECK([./input '0<0<0'], [1], [],
-         [syntax error, unexpected '<', expecting '<' or '>'
+         [syntax error, unexpected '<'
 ])
 
 AT_PARSER_CHECK([./input '0>0'])
 AT_PARSER_CHECK([./input '0>0>0'], [1], [],
-         [syntax error, unexpected '>', expecting '<' or '>'
+         [syntax error, unexpected '>'
 ])
 
 AT_PARSER_CHECK([./input '0<0>0'], [1], [],
-         [syntax error, unexpected '>', expecting '<' or '>'
+         [syntax error, unexpected '>'
 ])
 
 AT_CLEANUP
--
1.5.4.3




Re: Fix %error-verbose for conflicts resolved by %nonassoc.

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, 25 Aug 2009, Joel E. Denny wrote:

> I feel this area is a bit too subtle to include this change
> in branch-2.4.2.

At least I got that right.

As for the rest, the following patch hopefully fixes it.  I pushed it to
master and a similar patch to branch-2.5.

> I'm not proud of propagating more preprocessor macros to yacc.c.  Maybe
> this is a good place for some more functions like the ones Akim has
> discussed recently.  Also, we could probably find better names than NINF.  
> For example, error_action or default_action.

They're still macros in yacc.c, but I've chosen better names, and I've
added corresponding functions in lalr1.cc and lalr1.java.

>From f2b30bdf3713e6fa9fafd0fc6caed68e38248ebc Mon Sep 17 00:00:00 2001
From: Joel E. Denny <jdenny@...>
Date: Tue, 25 Aug 2009 19:41:49 -0400
Subject: [PATCH] More fixes related to last two patches.

* data/bison.m4 (b4_integral_parser_tables_map): Fix YYTABLE
comments: zero indicates syntax error not default action.
* data/c.m4 (b4_table_value_equals): Comment that YYID must be
defined.
* data/glr.c (yyis_pact_ninf): Rename to...
(yypact_value_is_default): ... this.
(yyisDefaultedState): Update for rename.
(yyis_table_ninf): Rename to...
(yytable_value_is_error): ... this, and check for value zero
besides just YYTABLE_NINF.
(yygetLRActions): Check for default value from yypact.  It
appears that this check is always performed before this function
is invoked, and so adding the check here is probably redundant.
However, the code may evolve after this subtlety is forgotten.
Also, update for rename to yytable_value_is_error.  Because that
macro now checks for zero, a different but equivalent branch of
the if-then-else here is evaluated.
(yyreportSyntaxError): Update for rename to
yytable_value_is_error.  The zero condition was mishandled
before.
(yyrecoverSyntaxError): Update for renames.  No behavioral
changes.
* data/lalr1.cc, data/lalr1.java (yy_pact_value_is_default_):
New function.
(yy_table_value_is_error_): New function.
(parse): Use new functions where possible.  No behavioral
changes.
(yysyntax_error_, yysyntax_error): Use yy_table_value_is_error_.
The zero condition was mishandled before.
* data/yacc.c (yyis_pact_ninf): Rename to...
(yypact_value_is_default): ... this.
(yyis_table_ninf): Rename to...
(yytable_value_is_error): ... this, and check for value zero
besides just YYTABLE_NINF.
(yysyntax_error): Update for rename to yytable_value_is_error.
The zero condition was mishandled before.
(yyparse): Update for renames.  No behavioral changes.
* src/tables.h: Improve comments about yypact, yytable, etc.
more.  Most importantly, say yytable value of zero means syntax
error not default action.
---
 ChangeLog        |   44 ++++++
 data/bison.m4    |    3 +-
 data/c.m4        |    3 +-
 data/glr.c       |   22 ++--
 data/lalr1.cc    |   28 +++-
 data/lalr1.java  |   28 +++-
 data/yacc.c      |   15 +-
 src/parse-gram.c |  438 +++++++++++++++++++++++++++---------------------------
 src/parse-gram.h |    6 +-
 src/tables.h     |   58 ++++----
 10 files changed, 367 insertions(+), 278 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 6624a0c..93a7499 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,49 @@
 2009-08-25  Joel E. Denny  <jdenny@...>
 
+ More fixes related to last two patches.
+ * data/bison.m4 (b4_integral_parser_tables_map): Fix YYTABLE
+ comments: zero indicates syntax error not default action.
+ * data/c.m4 (b4_table_value_equals): Comment that YYID must be
+ defined.
+ * data/glr.c (yyis_pact_ninf): Rename to...
+ (yypact_value_is_default): ... this.
+ (yyisDefaultedState): Update for rename.
+ (yyis_table_ninf): Rename to...
+ (yytable_value_is_error): ... this, and check for value zero
+ besides just YYTABLE_NINF.
+ (yygetLRActions): Check for default value from yypact.  It
+ appears that this check is always performed before this function
+ is invoked, and so adding the check here is probably redundant.
+ However, the code may evolve after this subtlety is forgotten.
+ Also, update for rename to yytable_value_is_error.  Because that
+ macro now checks for zero, a different but equivalent branch of
+ the if-then-else here is evaluated.
+ (yyreportSyntaxError): Update for rename to
+ yytable_value_is_error.  The zero condition was mishandled
+ before.
+ (yyrecoverSyntaxError): Update for renames.  No behavioral
+ changes.
+ * data/lalr1.cc, data/lalr1.java (yy_pact_value_is_default_):
+ New function.
+ (yy_table_value_is_error_): New function.
+ (parse): Use new functions where possible.  No behavioral
+ changes.
+ (yysyntax_error_, yysyntax_error): Use yy_table_value_is_error_.
+ The zero condition was mishandled before.
+ * data/yacc.c (yyis_pact_ninf): Rename to...
+ (yypact_value_is_default): ... this.
+ (yyis_table_ninf): Rename to...
+ (yytable_value_is_error): ... this, and check for value zero
+ besides just YYTABLE_NINF.
+ (yysyntax_error): Update for rename to yytable_value_is_error.
+ The zero condition was mishandled before.
+ (yyparse): Update for renames.  No behavioral changes.
+ * src/tables.h: Improve comments about yypact, yytable, etc.
+ more.  Most importantly, say yytable value of zero means syntax
+ error not default action.
+
+2009-08-25  Joel E. Denny  <jdenny@...>
+
  Fix %error-verbose for conflicts resolved by %nonassoc.
  * NEWS (2.5): Document.
  * data/glr.c (yyreportSyntaxError): Fix this by checking
diff --git a/data/bison.m4 b/data/bison.m4
index 4ca6cee..ef8f778 100644
--- a/data/bison.m4
+++ b/data/bison.m4
@@ -266,8 +266,7 @@ $1([defgoto], [b4_defgoto], [[YYDEFGOTO[NTERM-NUM].]])
 $1([table], [b4_table],
    [[YYTABLE[YYPACT[STATE-NUM]].  What to do in state STATE-NUM.  If
 positive, shift that token.  If negative, reduce the rule which
-number is the opposite.  If zero, do what YYDEFACT says.
-If YYTABLE_NINF, syntax error.]])
+number is the opposite.  If zero or YYTABLE_NINF, syntax error.]])
 
 $1([check], [b4_check])
 
diff --git a/data/c.m4 b/data/c.m4
index 3ba48db..5cd404d 100644
--- a/data/c.m4
+++ b/data/c.m4
@@ -161,7 +161,8 @@ m4_define([b4_int_type_for],
 # --------------------------------------------
 # Without inducing a comparison warning from the compiler, check if the
 # literal value LITERAL equals VALUE from table TABLE, which must have
-# TABLE_min and TABLE_max defined.
+# TABLE_min and TABLE_max defined.  YYID must be defined as an identity
+# function that suppresses warnings about constant conditions.
 m4_define([b4_table_value_equals],
 [m4_if(m4_eval($3 < m4_indir([b4_]$1[_min])
                || m4_indir([b4_]$1[_max]) < $3), [1],
diff --git a/data/glr.c b/data/glr.c
index 85a5aff..8d37070 100644
--- a/data/glr.c
+++ b/data/glr.c
@@ -953,7 +953,7 @@ yylhsNonterm (yyRuleNum yyrule)
   return yyr1[yyrule];
 }
 
-#define yyis_pact_ninf(yystate) \
+#define yypact_value_is_default(yystate) \
   ]b4_table_value_equals([[pact]], [[yystate]], [b4_pact_ninf])[
 
 /** True iff LR state STATE has only a default reduction (regardless
@@ -961,7 +961,7 @@ yylhsNonterm (yyRuleNum yyrule)
 static inline yybool
 yyisDefaultedState (yyStateNum yystate)
 {
-  return yyis_pact_ninf (yypact[yystate]);
+  return yypact_value_is_default (yypact[yystate]);
 }
 
 /** The default reduction for STATE, assuming it has one.  */
@@ -971,8 +971,9 @@ yydefaultAction (yyStateNum yystate)
   return yydefact[yystate];
 }
 
-#define yyis_table_ninf(yytable_value) \
-  ]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[
+#define yytable_value_is_error(yytable_value) \
+  (]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[ \
+   || ]b4_table_value_equals([[table]], [[yytable_value]], [[0]])[)
 
 /** Set *YYACTION to the action to take in YYSTATE on seeing YYTOKEN.
  *  Result R means
@@ -987,12 +988,13 @@ yygetLRActions (yyStateNum yystate, int yytoken,
  int* yyaction, const short int** yyconflicts)
 {
   int yyindex = yypact[yystate] + yytoken;
-  if (yyindex < 0 || YYLAST < yyindex || yycheck[yyindex] != yytoken)
+  if (yypact_value_is_default (yypact[yystate])
+      || yyindex < 0 || YYLAST < yyindex || yycheck[yyindex] != yytoken)
     {
       *yyaction = -yydefact[yystate];
       *yyconflicts = yyconfl;
     }
-  else if (! yyis_table_ninf (yytable[yyindex]))
+  else if (! yytable_value_is_error (yytable[yyindex]))
     {
       *yyaction = yytable[yyindex];
       *yyconflicts = yyconfl + yyconflp[yyindex];
@@ -2061,7 +2063,7 @@ yyreportSyntaxError (yyGLRStack* yystackp]b4_user_formals[)
 
   for (yyx = yyxbegin; yyx < yyxend; ++yyx)
     if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR
- && !yyis_table_ninf (yytable[yyx + yyn]))
+ && !yytable_value_is_error (yytable[yyx + yyn]))
       {
  if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM)
   {
@@ -2172,7 +2174,7 @@ yyrecoverSyntaxError (yyGLRStack* yystackp]b4_user_formals[)
     YY_SYMBOL_PRINT ("Next token is", yytoken, &yylval, &yylloc);
   }
  yyj = yypact[yystackp->yytops.yystates[0]->yylrState];
- if (yyis_pact_ninf (yyj))
+ if (yypact_value_is_default (yyj))
   return;
  yyj += yytoken;
  if (yyj < 0 || YYLAST < yyj || yycheck[yyj] != yytoken)
@@ -2180,7 +2182,7 @@ yyrecoverSyntaxError (yyGLRStack* yystackp]b4_user_formals[)
     if (yydefact[yystackp->yytops.yystates[0]->yylrState] != 0)
       return;
   }
- else if (yytable[yyj] != 0 && ! yyis_table_ninf (yytable[yyj]))
+ else if (! yytable_value_is_error (yytable[yyj]))
   return;
       }
 
@@ -2201,7 +2203,7 @@ yyrecoverSyntaxError (yyGLRStack* yystackp]b4_user_formals[)
     {
       yyGLRState *yys = yystackp->yytops.yystates[0];
       yyj = yypact[yys->yylrState];
-      if (! yyis_pact_ninf (yyj))
+      if (! yypact_value_is_default (yyj))
  {
   yyj += YYTERROR;
   if (0 <= yyj && yyj <= YYLAST && yycheck[yyj] == YYTERROR
diff --git a/data/lalr1.cc b/data/lalr1.cc
index 1e4c856..36c8d58 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -245,6 +245,14 @@ do {                                                            \
     /// \param yylhs     the nonterminal to push on the stack
     state_type yy_lr_goto_state_ (state_type yystate, int yylhs);
 
+    /// Whether the given \c yypact_ value indicates a defaulted state.
+    /// \param yyvalue   the value to check
+    static bool yy_pact_value_is_default_ (int yyvalue);
+
+    /// Whether the given \c yytable_ value indicates a syntax error.
+    /// \param yyvalue   the value to check
+    static bool yy_table_value_is_error_ (int yyvalue);
+
     /// Internal symbol numbers.
     typedef ]b4_int_type_for([b4_translate])[ token_number_type;
     static const ]b4_int_type(b4_pact_ninf, b4_pact_ninf)[ yypact_ninf_;
@@ -651,6 +659,18 @@ b4_percent_code_get[]dnl
       return yydefgoto_[yylhs - yyntokens_];
   }
 
+  inline bool
+  ]b4_parser_class_name[::yy_pact_value_is_default_ (int yyvalue)
+  {
+    return yyvalue == yypact_ninf_;
+  }
+
+  inline bool
+  ]b4_parser_class_name[::yy_table_value_is_error_ (int yyvalue)
+  {
+    return yyvalue == 0 || yyvalue == yytable_ninf_;
+  }
+
   int
   ]b4_parser_class_name[::parse ()
   {
@@ -709,7 +729,7 @@ m4_popdef([b4_at_dollar])])dnl
 
     /* Try to take a decision without lookahead.  */
     yyn = yypact_[yystack_[0].state];
-    if (yyn == yypact_ninf_)
+    if (yy_pact_value_is_default_ (yyn))
       goto yydefault;
 
     /* Read a lookahead token.  */
@@ -737,7 +757,7 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param)));])[
     yyn = yytable_[yyn];
     if (yyn <= 0)
       {
- if (yyn == 0 || yyn == yytable_ninf_)
+ if (yy_table_value_is_error_ (yyn))
   goto yyerrlab;
  yyn = -yyn;
  goto yyreduce;
@@ -887,7 +907,7 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param)));])[
       for (;;)
         {
           yyn = yypact_[yystack_[0].state];
-          if (yyn != yypact_ninf_)
+          if (!yy_pact_value_is_default_ (yyn))
             {
               yyn += yyterror_;
               if (0 <= yyn && yyn <= yylast_ && yycheck_[yyn] == yyterror_)
@@ -972,7 +992,7 @@ b4_error_verbose_if([state_type yystate, int yytoken],
         yyarg[yycount++] = yytname_[yytoken];
  for (int yyx = yyxbegin; yyx < yyxend; ++yyx)
   if (yycheck_[yyx + yyn] == yyx && yyx != yyterror_
-      && yytable_[yyx + yyn] != yytable_ninf_)
+      && !yy_table_value_is_error_ (yytable_[yyx + yyn]))
           {
             if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM)
             {
diff --git a/data/lalr1.java b/data/lalr1.java
index 646b777..5f0cfd7 100644
--- a/data/lalr1.java
+++ b/data/lalr1.java
@@ -533,7 +533,7 @@ m4_popdef([b4_at_dollar])])dnl
 
         /* Take a decision.  First try without lookahead.  */
         yyn = yypact_[yystate];
-        if (yyn == yypact_ninf_)
+        if (yy_pact_value_is_default_ (yyn))
           {
             label = YYDEFAULT;
             break;
@@ -572,7 +572,7 @@ m4_popdef([b4_at_dollar])])dnl
         /* <= 0 means reduce or error.  */
         else if ((yyn = yytable_[yyn]) <= 0)
           {
-            if (yyn == 0 || yyn == yytable_ninf_)
+            if (yy_table_value_is_error_ (yyn))
               label = YYFAIL;
             else
               {
@@ -676,7 +676,7 @@ m4_popdef([b4_at_dollar])])dnl
         for (;;)
           {
             yyn = yypact_[yystate];
-            if (yyn != yypact_ninf_)
+            if (!yy_pact_value_is_default_ (yyn))
               {
                 yyn += yyterror_;
                 if (0 <= yyn && yyn <= yylast_ && yycheck_[yyn] == yyterror_)
@@ -745,7 +745,7 @@ m4_popdef([b4_at_dollar])])dnl
             int count = 0;
             for (int x = yyxbegin; x < yyxend; ++x)
               if (yycheck_[x + yyn] == x && x != yyterror_
-                  && yycheck_[x + yyn] != yytable_ninf_)
+                  && !yy_table_value_is_error_ (yycheck_[x + yyn]))
                 ++count;
 
             // FIXME: This method of building the message is not compatible
@@ -757,7 +757,7 @@ m4_popdef([b4_at_dollar])])dnl
                 count = 0;
                 for (int x = yyxbegin; x < yyxend; ++x)
                   if (yycheck_[x + yyn] == x && x != yyterror_
-                      && yycheck_[x + yyn] != yytable_ninf_)
+                      && !yy_table_value_is_error_ (yycheck_[x + yyn]))
                     {
                       res.append (count++ == 0 ? ", expecting " : " or ");
                       res.append (yytnamerr_ (yytname_[x]));
@@ -770,6 +770,24 @@ m4_popdef([b4_at_dollar])])dnl
     return "syntax error";
   }
 
+  /**
+   * Whether the given <code>yypact_</code> value indicates a defaulted state.
+   * @@param yyvalue   the value to check
+   */
+  private static boolean yy_pact_value_is_default_ (int yyvalue)
+  {
+    return yyvalue == yypact_ninf_;
+  }
+
+  /**
+   * Whether the given <code>yytable_</code> value indicates a syntax error.
+   * @@param yyvalue   the value to check
+   */
+  private static boolean yy_table_value_is_error_ (int yyvalue)
+  {
+    return yyvalue == 0 || yyvalue == yytable_ninf_;
+  }
+
   private static final ]b4_int_type_for([b4_pact])[ yypact_ninf_ = ]b4_pact_ninf[;
   private static final ]b4_int_type_for([b4_table])[ yytable_ninf_ = ]b4_table_ninf[;
 
diff --git a/data/yacc.c b/data/yacc.c
index acc1118..505b09e 100644
--- a/data/yacc.c
+++ b/data/yacc.c
@@ -525,13 +525,14 @@ static const ]b4_int_type_for([b4_toknum])[ yytoknum[] =
 
 #define YYPACT_NINF ]b4_pact_ninf[
 
-#define yyis_pact_ninf(yystate) \
+#define yypact_value_is_default(yystate) \
   ]b4_table_value_equals([[pact]], [[yystate]], [b4_pact_ninf])[
 
 #define YYTABLE_NINF ]b4_table_ninf[
 
-#define yyis_table_ninf(yytable_value) \
-  ]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[
+#define yytable_value_is_error(yytable_value) \
+  (]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[ \
+  || ]b4_table_value_equals([[table]], [[yytable_value]], [[0]])[)
 
 ]b4_parser_tables_define[
 
@@ -870,7 +871,7 @@ yysyntax_error (char *yyresult, int yystate, int yytoken)
 
       for (yyx = yyxbegin; yyx < yyxend; ++yyx)
  if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR
-    && !yyis_table_ninf (yytable[yyx + yyn]))
+    && !yytable_value_is_error (yytable[yyx + yyn]))
   {
     if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM)
       {
@@ -1279,7 +1280,7 @@ yybackup:
 
   /* First try to decide what to do without reference to lookahead token.  */
   yyn = yypact[yystate];
-  if (yyis_pact_ninf (yyn))
+  if (yypact_value_is_default (yyn))
     goto yydefault;
 
   /* Not known => get a lookahead token if don't already have one.  */
@@ -1329,7 +1330,7 @@ yyread_pushed_token:]])[
   yyn = yytable[yyn];
   if (yyn <= 0)
     {
-      if (yyn == 0 || yyis_table_ninf (yyn))
+      if (yytable_value_is_error (yyn))
  goto yyerrlab;
       yyn = -yyn;
       goto yyreduce;
@@ -1513,7 +1514,7 @@ yyerrlab1:
   for (;;)
     {
       yyn = yypact[yystate];
-      if (!yyis_pact_ninf (yyn))
+      if (!yypact_value_is_default (yyn))
  {
   yyn += YYTERROR;
   if (0 <= yyn && yyn <= YYLAST && yycheck[yyn] == YYTERROR)
diff --git a/src/tables.h b/src/tables.h
index b21fa7b..9b52f09 100644
--- a/src/tables.h
+++ b/src/tables.h
@@ -45,6 +45,24 @@
 
    YYSTOS[S] = the symbol number of the symbol that leads to state S.
 
+   YYFINAL = the state number of the termination state.
+
+   YYTABLE = a vector filled with portions for different uses, found
+   via YYPACT and YYPGOTO.
+
+   YYLAST ( = high) the number of the last element of YYTABLE, i.e.,
+   sizeof (YYTABLE) - 1.
+
+   YYCHECK = a vector indexed in parallel with YYTABLE.  It indicates,
+   in a roundabout way, the bounds of the portion you are trying to
+   examine.
+
+   Suppose that the portion of YYTABLE starts at index P and the index
+   to be examined within the portion is I.  Then if YYCHECK[P+I] != I,
+   I is outside the bounds of what is actually allocated, and the
+   default (from YYDEFACT or YYDEFGOTO) should be used.  Otherwise,
+   YYTABLE[P+I] should be used.
+
    YYDEFACT[S] = default reduction number in state s.  Performed when
    YYTABLE doesn't specify something else to do.  Zero means the default
    is an error.
@@ -58,10 +76,10 @@
    YYTABLE to find out what action to perform.
 
    If YYPACT[S] == YYPACT_NINF, if YYPACT[S] + I is outside the bounds
-   of YYTABLE (from 0 to YYLAST), or if YYCHECK indicates that I is
-   outside the bounds of the portion for S, then the default action
-   (from YYDEFACT and YYDEFGOTO) should be used instead of YYTABLE.
-   Otherwise, the value YYTABLE[YYPACT[S] + I] should be used even if
+   of YYTABLE (from 0 to YYLAST), or I is outside the bounds for portion
+   S (that is, YYCHECK[YYPACT[S] + I] != I), then the default action
+   (that is, YYDEFACT[S]) should be used instead of YYTABLE.  Otherwise,
+   the value YYTABLE[YYPACT[S] + I] should be used even if
    YYPACT[S] < 0.
 
    If the value in YYTABLE is positive, we shift the token and go to
@@ -69,33 +87,19 @@
 
    If the value is negative, it is minus a rule number to reduce by.
 
-   If the value is zero, the default action from YYDEFACT[S] is used.
-
-   If the value is YYTABLE_NINF, it's a syntax error.
+   If the value is zero or YYTABLE_NINF, it's a syntax error.
 
    YYPGOTO[I] = the index in YYTABLE of the portion describing what to
    do after reducing a rule that derives variable I + NTOKENS.  This
    portion is indexed by the parser state number, S, as of before the
-   text for this nonterminal was read.  The value from YYTABLE is the
-   state to go to if the corresponding value in YYCHECK is S.
-
-   YYTABLE = a vector filled with portions for different uses, found
-   via YYPACT and YYPGOTO.
-
-   YYCHECK = a vector indexed in parallel with YYTABLE.  It indicates,
-   in a roundabout way, the bounds of the portion you are trying to
-   examine.
-
-   Suppose that the portion of YYTABLE starts at index P and the index
-   to be examined within the portion is I.  Then if YYCHECK[P+I] != I,
-   I is outside the bounds of what is actually allocated, and the
-   default (from YYDEFACT or YYDEFGOTO) should be used.  Otherwise,
-   YYTABLE[P+I] should be used.
-
-   YYFINAL = the state number of the termination state.
-
-   YYLAST ( = high) the number of the last element of YYTABLE, i.e.,
-   sizeof (YYTABLE) - 1.  */
+   text for this nonterminal was read.
+
+   If YYPGOTO[I] + S is outside the bounds of YYTABLE (from 0 to YYLAST)
+   or if S is outside the bounds of the portion for I (that is,
+   YYCHECK[YYPGOTO[I] + S] != S), then the default state (that is,
+   YYDEFGOTO[I]) should be used instead of YYTABLE.  Otherwise,
+   YYTABLE[YYPGOTO[I] + S] is the state to go to even if YYPGOTO[I] < 0.
+*/
 
 extern int nvectors;
 
--
1.5.4.3




Re: Fix %error-verbose for conflicts resolved by %nonassoc.

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, 25 Aug 2009, Joel E. Denny wrote:

> On Tue, 25 Aug 2009, Joel E. Denny wrote:
>
> > I feel this area is a bit too subtle to include this change
> > in branch-2.4.2.
>
> At least I got that right.
>
> As for the rest, the following patch hopefully fixes it.

Nope, so I pushed the patch below to master and a similar one to
branch-2.5.

My confusion has been as follows.  The comments in the source code said
that a zero in yytable indicates that a default action should be used.  
However, the code in the skeletons checked for zero as if it meant syntax
error instead.  In the previous patch, I just assumed that the comments
were out of date and that the skeletons were right because they work
correctly.  But that was wrong.  The issue is more subtle.

I finally decided to thoroughly investigate the code in tables.c to see
how the tables are actually being computed.  It turns out that the
comments were right but incomplete.  The zero check in the skeletons is
always false and thus misleading but harmless.  I tested this conclusion
by temporarily adding an assertion to yytable_value_is_error in all four
skeletons.  maintainer-check-valgrind still passed.

The patch below cleans up the documentation and the misleading code.

> diff --git a/data/lalr1.java b/data/lalr1.java

> -                  && yycheck_[x + yyn] != yytable_ninf_)
> +                  && !yy_table_value_is_error_ (yycheck_[x + yyn]))

> -                      && yycheck_[x + yyn] != yytable_ninf_)
> +                      && !yy_table_value_is_error_ (yycheck_[x + yyn]))

Ugh.  That's supposed to be yytable_ not yycheck_.  It's fixed below.

Sorry for all the noise.

>From aa0cb40d61cda5bfa9d325a45735439cbbd06327 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <jdenny@...>
Date: Wed, 26 Aug 2009 02:40:38 -0400
Subject: [PATCH] Actually handle the yytable zero value correctly this time.

* data/bison.m4 (b4_integral_parser_tables_map): Don't mention
zero values in the YYTABLE comments.
* data/glr.c (yytable_value_is_error): Don't check for zero
value.
* data/lalr1.cc (yy_table_value_is_error_): Likewise.
* data/yacc.c (yytable_value_is_error): Likewise.
* data/lalr1.java (yy_table_value_is_error_): Likewise.
(yysyntax_error): Fix typo in code: use yytable_ not yycheck_.
* src/tables.h: In header comments, explain why it's useless to
check for a zero value in yytable.
---
 ChangeLog        |   14 ++
 data/bison.m4    |    2 +-
 data/glr.c       |    3 +-
 data/lalr1.cc    |    2 +-
 data/lalr1.java  |    6 +-
 data/yacc.c      |    3 +-
 src/parse-gram.c |  633 +++++++++++++++++++++++++++---------------------------
 src/parse-gram.h |    6 +-
 src/tables.c     |    2 +-
 src/tables.h     |   14 +-
 10 files changed, 353 insertions(+), 332 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 22a2a3c..89f863f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+2009-08-26  Joel E. Denny  <jdenny@...>
+
+ Actually handle the yytable zero value correctly this time.
+ * data/bison.m4 (b4_integral_parser_tables_map): Don't mention
+ zero values in the YYTABLE comments.
+ * data/glr.c (yytable_value_is_error): Don't check for zero
+ value.
+ * data/lalr1.cc (yy_table_value_is_error_): Likewise.
+ * data/yacc.c (yytable_value_is_error): Likewise.
+ * data/lalr1.java (yy_table_value_is_error_): Likewise.
+ (yysyntax_error): Fix typo in code: use yytable_ not yycheck_.
+ * src/tables.h: In header comments, explain why it's useless to
+ check for a zero value in yytable.
+
 2009-08-25  Joel E. Denny  <jdenny@...>
 
  More fixes related to last two patches.
diff --git a/data/bison.m4 b/data/bison.m4
index ef8f778..0928721 100644
--- a/data/bison.m4
+++ b/data/bison.m4
@@ -266,7 +266,7 @@ $1([defgoto], [b4_defgoto], [[YYDEFGOTO[NTERM-NUM].]])
 $1([table], [b4_table],
    [[YYTABLE[YYPACT[STATE-NUM]].  What to do in state STATE-NUM.  If
 positive, shift that token.  If negative, reduce the rule which
-number is the opposite.  If zero or YYTABLE_NINF, syntax error.]])
+number is the opposite.  If YYTABLE_NINF, syntax error.]])
 
 $1([check], [b4_check])
 
diff --git a/data/glr.c b/data/glr.c
index 8d37070..6f1f7ee 100644
--- a/data/glr.c
+++ b/data/glr.c
@@ -972,8 +972,7 @@ yydefaultAction (yyStateNum yystate)
 }
 
 #define yytable_value_is_error(yytable_value) \
-  (]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[ \
-   || ]b4_table_value_equals([[table]], [[yytable_value]], [[0]])[)
+  ]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[
 
 /** Set *YYACTION to the action to take in YYSTATE on seeing YYTOKEN.
  *  Result R means
diff --git a/data/lalr1.cc b/data/lalr1.cc
index 36c8d58..8e79333 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -668,7 +668,7 @@ b4_percent_code_get[]dnl
   inline bool
   ]b4_parser_class_name[::yy_table_value_is_error_ (int yyvalue)
   {
-    return yyvalue == 0 || yyvalue == yytable_ninf_;
+    return yyvalue == yytable_ninf_;
   }
 
   int
diff --git a/data/lalr1.java b/data/lalr1.java
index 5f0cfd7..6637c3a 100644
--- a/data/lalr1.java
+++ b/data/lalr1.java
@@ -745,7 +745,7 @@ m4_popdef([b4_at_dollar])])dnl
             int count = 0;
             for (int x = yyxbegin; x < yyxend; ++x)
               if (yycheck_[x + yyn] == x && x != yyterror_
-                  && !yy_table_value_is_error_ (yycheck_[x + yyn]))
+                  && !yy_table_value_is_error_ (yytable_[x + yyn]))
                 ++count;
 
             // FIXME: This method of building the message is not compatible
@@ -757,7 +757,7 @@ m4_popdef([b4_at_dollar])])dnl
                 count = 0;
                 for (int x = yyxbegin; x < yyxend; ++x)
                   if (yycheck_[x + yyn] == x && x != yyterror_
-                      && !yy_table_value_is_error_ (yycheck_[x + yyn]))
+                      && !yy_table_value_is_error_ (yytable_[x + yyn]))
                     {
                       res.append (count++ == 0 ? ", expecting " : " or ");
                       res.append (yytnamerr_ (yytname_[x]));
@@ -785,7 +785,7 @@ m4_popdef([b4_at_dollar])])dnl
    */
   private static boolean yy_table_value_is_error_ (int yyvalue)
   {
-    return yyvalue == 0 || yyvalue == yytable_ninf_;
+    return yyvalue == yytable_ninf_;
   }
 
   private static final ]b4_int_type_for([b4_pact])[ yypact_ninf_ = ]b4_pact_ninf[;
diff --git a/data/yacc.c b/data/yacc.c
index 505b09e..163ecfe 100644
--- a/data/yacc.c
+++ b/data/yacc.c
@@ -531,8 +531,7 @@ static const ]b4_int_type_for([b4_toknum])[ yytoknum[] =
 #define YYTABLE_NINF ]b4_table_ninf[
 
 #define yytable_value_is_error(yytable_value) \
-  (]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[ \
-  || ]b4_table_value_equals([[table]], [[yytable_value]], [[0]])[)
+  ]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[
 
 ]b4_parser_tables_define[
 
diff --git a/src/tables.c b/src/tables.c
index f6cc073..d1f7e7d 100644
--- a/src/tables.c
+++ b/src/tables.c
@@ -126,7 +126,7 @@ static int table_size = 32768;
 base_number *table;
 base_number *check;
 /* The value used in TABLE to denote explicit syntax errors
-   (%nonassoc), a negative infinite.  First defaults to ACTION_NUMBER_MININUM,
+   (%nonassoc), a negative infinite.  First defaults to ACTION_NUMBER_MINIMUM,
    but in order to keep small tables, renumbered as TABLE_ERROR, which
    is the smallest (non error) value minus 1.  */
 base_number table_ninf = 0;
diff --git a/src/tables.h b/src/tables.h
index 9b52f09..02381e2 100644
--- a/src/tables.h
+++ b/src/tables.h
@@ -48,7 +48,7 @@
    YYFINAL = the state number of the termination state.
 
    YYTABLE = a vector filled with portions for different uses, found
-   via YYPACT and YYPGOTO.
+   via YYPACT and YYPGOTO, described below.
 
    YYLAST ( = high) the number of the last element of YYTABLE, i.e.,
    sizeof (YYTABLE) - 1.
@@ -87,7 +87,7 @@
 
    If the value is negative, it is minus a rule number to reduce by.
 
-   If the value is zero or YYTABLE_NINF, it's a syntax error.
+   If the value is YYTABLE_NINF, it's a syntax error.
 
    YYPGOTO[I] = the index in YYTABLE of the portion describing what to
    do after reducing a rule that derives variable I + NTOKENS.  This
@@ -99,6 +99,16 @@
    YYCHECK[YYPGOTO[I] + S] != S), then the default state (that is,
    YYDEFGOTO[I]) should be used instead of YYTABLE.  Otherwise,
    YYTABLE[YYPGOTO[I] + S] is the state to go to even if YYPGOTO[I] < 0.
+
+   When the above YYPACT, YYPGOTO, and YYCHECK tests determine that a
+   value from YYTABLE should be used, that value is never zero, so it is
+   useless to check for zero.  When those tests indicate that the value
+   from YYDEFACT or YYDEFGOTO should be used instead, the value from
+   YYTABLE *might* be zero, which, as a consequence of the way in which
+   the tables are constructed, also happens to indicate that YYDEFACT or
+   YYDEFGOTO should be used.  However, the YYTABLE value cannot be
+   trusted when the YYDEFACT or YYDEFGOTO value should be used.  In
+   summary, forget about zero values in YYTABLE.
 */
 
 extern int nvectors;
--
1.5.4.3




Re: Fix %error-verbose for conflicts resolved by %nonassoc.

by Akim Demaille :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Le 25 août 09 à 09:10, Joel E. Denny a écrit :

Hi Joel,

Nice and deep work!

> Verbose syntax error messages used to include error action tokens in  
> the
> expected token list.  There was a FIXME in conflicts.at about this.  
> To
> fix it, I pushed the following patches to master and similar ones to
> branch-2.5.  I feel this area is a bit too subtle to include this  
> change
> in branch-2.4.2.

I agree.

> I'm not proud of propagating more preprocessor macros to yacc.c.  
> Maybe
> this is a good place for some more functions like the ones Akim has
> discussed recently.  Also, we could probably find better names than  
> NINF.
> For example, error_action or default_action.

Sounds good to me.

> diff --git a/tests/conflicts.at b/tests/conflicts.at
> index 98b7d41..f2f7861 100644
> --- a/tests/conflicts.at
> +++ b/tests/conflicts.at
> @@ -99,20 +99,17 @@ AT_BISON_CHECK([-o input.c input.y])
> AT_COMPILE([input])
>
> AT_PARSER_CHECK([./input '0<0'])
> -# FIXME: This is an actual bug, but a new one, in the sense that
> -# no one has ever spotted it!  The messages are *wrong*: there should
> -# be nothing there, it should be expected eof.
> AT_PARSER_CHECK([./input '0<0<0'], [1], [],
> -         [syntax error, unexpected '<', expecting '<' or '>'
> +         [syntax error, unexpected '<'
> ])
>
> AT_PARSER_CHECK([./input '0>0'])
> AT_PARSER_CHECK([./input '0>0>0'], [1], [],
> -         [syntax error, unexpected '>', expecting '<' or '>'
> +         [syntax error, unexpected '>'
> ])
>
> AT_PARSER_CHECK([./input '0<0>0'], [1], [],
> -         [syntax error, unexpected '>', expecting '<' or '>'
> +         [syntax error, unexpected '>'
> ])

I don't understand why it does not refer to eof, as was noted in the  
comment.  I have not followed accurately everything you wrote about  
the NINF vs. 0 issue, but is it possible that the special case of end-
of-file, which number is 0, be a reason about it?  Maybe that would  
also explain my incorrect messages in the nearby thread about semantic  
error messages.




Re: Fix %error-verbose for conflicts resolved by %nonassoc.

by Akim Demaille :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Le 25 août 09 à 09:10, Joel E. Denny a écrit :

Hi Joel,

> diff --git a/data/yacc.c b/data/yacc.c
> index fa14cc5..9689223 100644
> --- a/data/yacc.c
> +++ b/data/yacc.c
> @@ -525,8 +525,14 @@ static const ]b4_int_type_for([b4_toknum])
> [ yytoknum[] =
>
> #define YYPACT_NINF ]b4_pact_ninf[
>
> +#define yyis_pact_ninf(yystate) \
> +  ]b4_table_value_equals([[pact]], [[yystate]], [b4_pact_ninf])[
> +
> #define YYTABLE_NINF ]b4_table_ninf[
>
> +#define yyis_table_ninf(yytable_value) \
> +  ]b4_table_value_equals([[table]], [[yytable_value]],  
> [b4_table_ninf])[

We try to stick to CamelCase for macro argument names.  I case see  
that this way it really looks like a function, but it is not :)


Re: Fix %error-verbose for conflicts resolved by %nonassoc.

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 26 Aug 2009, Akim Demaille wrote:

> > #define YYPACT_NINF ]b4_pact_ninf[
> >
> > +#define yyis_pact_ninf(yystate) \
> > +  ]b4_table_value_equals([[pact]], [[yystate]], [b4_pact_ninf])[
> > +
> > #define YYTABLE_NINF ]b4_table_ninf[
> >
> > +#define yyis_table_ninf(yytable_value) \
> > +  ]b4_table_value_equals([[table]], [[yytable_value]], [b4_table_ninf])[
>
> We try to stick to CamelCase for macro argument names.  I case see that this
> way it really looks like a function, but it is not :)

Ah, thanks.  I had copied from glr.c, but I've noticed in the past that
glr.c has its own style.

But what about making this a static inline function?  Or does this
discussion need to wait until we abandon K&R?



Re: Fix %error-verbose for conflicts resolved by %nonassoc.

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 26 Aug 2009, Akim Demaille wrote:

> > AT_PARSER_CHECK([./input '0<0'])
> > -# FIXME: This is an actual bug, but a new one, in the sense that
> > -# no one has ever spotted it!  The messages are *wrong*: there should
> > -# be nothing there, it should be expected eof.
> > AT_PARSER_CHECK([./input '0<0<0'], [1], [],
> > -         [syntax error, unexpected '<', expecting '<' or '>'
> > +         [syntax error, unexpected '<'
> > ])
> >
> > AT_PARSER_CHECK([./input '0>0'])
> > AT_PARSER_CHECK([./input '0>0>0'], [1], [],
> > -         [syntax error, unexpected '>', expecting '<' or '>'
> > +         [syntax error, unexpected '>'
> > ])
> >
> > AT_PARSER_CHECK([./input '0<0>0'], [1], [],
> > -         [syntax error, unexpected '>', expecting '<' or '>'
> > +         [syntax error, unexpected '>'
> > ])
>
> I don't understand why it does not refer to eof, as was noted in the comment.

Thanks for mentioning that.  I meant to address it, but I fell down a
different rabbit hole and forgot.

%error-verbose cannot be completely trusted without canonical LR.  There
are two problems: default reductions, which lose lookaheads, and state
merging, which loses and adds lookaheads.

Normally the parser cannot detect any syntax error when in a state with a
default reduction because all lookaheads look valid even if they aren't.  
In that case, after performing the default reduction, the parser enters a
new state where some valid lookaheads from the previous state might not be
valid lookaheads anymore.  Thus, lookaheads are lost in the eventual
syntax error message.

Your grammar has a special exception: it has an explicit syntax error in a
state with a default reduction.  Thus, that particular syntax error is
possible to detect without performing the default reduction.  
Nevertheless, the list of valid lookaheads for the default reduction is
not recorded, so they aren't reported in the syntax error message.  $end
is one such valid lookahead.

We can stop the loss of lookaheads due to default reductions by disabling
default reductions for inconsistent states:

  %define lr.default-reductions "consistent"

A consistent state has one action.  If that action is shift, the one valid
lookahead is known.  If that action is reduce, it can become a default
reduction.  That drops the lookahead set, but that's fine.  Because the
reduction is the only action in this state, it will always be performed,
and the valid lookaheads for this left context are always the same as (or,
due to state merging, a subset of) the lookaheads for the state that
follows the subsequent goto.  They have to be the same because what can
come next cannot be restricted by choosing the only possible action.  No
valid lookaheads are lost.

I pushed the patch below to extend your test case in order to show the
above solution.

Though it does not affect your test case, we still have not solved the
state merging problem.  That is, when lookahead sets are merged for
different left contexts, some invalid lookaheads might now look valid and
thus might be reported as expected tokens.  Moreover, as with default
reductions, invalid reductions might be performed leading to a restricted
set of lookaheads.  Writing an LALR grammar does not prevent state merging
and thus does not solve this problem in general, and neither does using
IELR.  Canonical LR is required:

  %define lr.type "canonical LR"

A side effect of that declaration is that default reductions are disabled
completely by default, so you no longer need the previous declaration.  
However, if you always want to construct as much of the semantic left
context as possible before reading the next token, you should enable
default reductions in consistent states by combining the above two
declarations.

(To be clear, even though IELR cannot fix %error-verbose, it is still
worthwhile during development when an accurate %error-verbose is
ultimately needed.  I've found that the number of conflicts that must be
debugged in canonical LR parser tables tends to be an order of magnitude
larger than in LALR and IELR.  So, if you don't mind canonical LR's size
in your final product in order to get perfect lookahead sets, debug your
conflicts with IELR, and then switch to canonical LR.)

I really need to rewrite the lr.default-reductions and lr.type
documentation.  A cross-reference from %error-verbose would be nice.  
I'll try to get to that before the 2.5 release.

> Maybe that would also explain my incorrect
> messages in the nearby thread about semantic error messages.

I skimmed that, and I would guess that you need canonical LR.  I haven't
explored your grammar yet though.

>From d1cc31c5f04b81a3620fa291020ce23490f3f9e7 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <jdenny@...>
Date: Wed, 26 Aug 2009 14:15:53 -0400
Subject: [PATCH] tests: show a use of %define lr.default-reductions "consistent"

* tests/conflicts.at (%nonassoc and eof): Extend to test that it
prevents the omission of expected tokens for %error-verbose.
---
 ChangeLog          |    6 ++++++
 tests/conflicts.at |   23 +++++++++++++++++++++++
 2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 40df1d4..44f27ff 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2009-08-26  Joel E. Denny  <jdenny@...>
+
+ tests: show a use of %define lr.default-reductions "consistent"
+ * tests/conflicts.at (%nonassoc and eof): Extend to test that it
+ prevents the omission of expected tokens for %error-verbose.
+
 2009-08-26  Akim Demaille  <demaille@...>
 
  tests: portability fix.
diff --git a/tests/conflicts.at b/tests/conflicts.at
index f2f7861..26ec08d 100644
--- a/tests/conflicts.at
+++ b/tests/conflicts.at
@@ -112,6 +112,29 @@ AT_PARSER_CHECK([./input '0<0>0'], [1], [],
          [syntax error, unexpected '>'
 ])
 
+# We must disable default reductions in inconsistent states in order to
+# have an explicit list of all expected tokens.  (However, unless we use
+# canonical LR, lookahead sets are merged for different left contexts,
+# so it is still possible to have extra incorrect tokens in the expected
+# list.  That just doesn't happen to be a problem for this test case.)
+
+AT_BISON_CHECK([-Dlr.default-reductions=consistent -o input.c input.y])
+AT_COMPILE([input])
+
+AT_PARSER_CHECK([./input '0<0'])
+AT_PARSER_CHECK([./input '0<0<0'], [1], [],
+         [syntax error, unexpected '<', expecting $end
+])
+
+AT_PARSER_CHECK([./input '0>0'])
+AT_PARSER_CHECK([./input '0>0>0'], [1], [],
+         [syntax error, unexpected '>', expecting $end
+])
+
+AT_PARSER_CHECK([./input '0<0>0'], [1], [],
+         [syntax error, unexpected '>', expecting $end
+])
+
 AT_CLEANUP
 
 
--
1.5.4.3




Re: Fix %error-verbose for conflicts resolved by %nonassoc.

by Akim Demaille :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Le 27 août 09 à 00:21, Joel E. Denny a écrit :

> Thanks for mentioning that.  I meant to address it, but I fell down a
> different rabbit hole and forgot.
> [...]

Your explanations are crystal clear.  It's a very nice basis to  
address the following points:

> I really need to rewrite the lr.default-reductions and lr.type
> documentation.  A cross-reference from %error-verbose would be nice.
> I'll try to get to that before the 2.5 release.

Thanks for taking the time to analyse this!

>> Maybe that would also explain my incorrect
>> messages in the nearby thread about semantic error messages.
>
> I skimmed that, and I would guess that you need canonical LR.  I  
> haven't
> explored your grammar yet though.

I don't think you have the grammar I was referring too, or maybe an  
old version of it.




Re: several messages

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Akim.

On Wed, 2 Sep 2009, Akim Demaille wrote:

> I played with canonical-lr, but it is way too expensive for me, both in terms
> of size, and duration of compilation (maybe in a release mode, but...).

By compilation, you mean gcc?  Or is bison slow?

> But
> toying with lr.default-reductions (still in lalr) gives very good results.
>
> I have 382 LR(0) states, and 12941 is canonical LR(1).

That's quite a jump.  I usually see only one more digit.  There must be
many similar contexts within your grammar.  What does IELR(1) give you?  
How long does bison take to run for IELR(1) versus LALR(1)?

> Here are the size of the parser files.
>
> akim@montero ~/src/gnet/kernel $ ls -l _build/*/ugrammar.cc
> 12:40:05
> -rw-r--r--  1 akim  akim   321287 Sep  2 12:10 _build/accepting/ugrammar.cc
> -rw-r--r--  1 akim  akim   170357 Sep  2 12:13 _build/all/ugrammar.cc
> -rw-r--r--  1 akim  akim  8913314 Sep  2 11:52 _build/canonical-lr/ugrammar.cc
> -rw-r--r--  1 akim  akim   263165 Sep  2 12:06 _build/consistent/ugrammar.cc

It's interesting that each value for lr.default-reductions has a
significant impact on size, but at least it's not orders of magnitude as
for canonical LR.

> The syntax error on "1 1" behaves as expected with both accepting and
> consistent (i.e., mute since there are too many possibilities ;).

If you could unmute it (by adding a few more possibilities in the
skeleton), it would be interesting to see if canonical LR gives the same
expected tokens list.

> So I will proceed with default-reductions=all on space-starving targets (on
> which anyway I use simple error messages to get rid of the tables), and
> "accepting" otherwise.

Why not "consistent"?  For error messages, I don't know of any reason to
expect either to typically be better than the other.  But "consistent"
produces the traditional Yacc behavior (actually defined in POSIX) of not
requesting a token from yylex until necessary.  This can affect semantic
actions.

The only reason I made canonical LR choose "accepting" by default is
because I think of that as being the canonical form.

> The documentation for error-verbose should probably
> promote the use of accepting/consistent.

They can sometimes worsen the error messages.  That is, in some cases,
lookahead merging might produce worse effects on the expected list in the
state containing the default reduction than is produced by performing the
default reduction.  For example, consider the input "aaz" for this
grammar:

  %token 'z';
  S: 'a' A 'a'
   | 'b' A B
   ;
  A: 'a' ;
  B: 'a' | 'b' | 'c' | 'd' ;

With lr.type=canonical-lr or lr.type=lalr:

  syntax error, unexpected 'z', expecting 'a'

But with lr.type=lalr and lr.default-reductions=accepting:

  syntax error, unexpected 'z', expecting 'a' or 'b' or 'c' or 'd'

If we add:

  %left 'a';
  A: 'a' 'a';

Then lr.default-reductions=consistent has the same problem.

My suspicion would have been that disabling default reductions almost
always does more good than harm, but I think we need more exploration.

On Wed, 2 Sep 2009, Akim Demaille wrote:

> Le 27 août 09 à 00:21, Joel E. Denny a écrit :
>
> > Thanks for mentioning that.  I meant to address it, but I fell down a
> > different rabbit hole and forgot.
> > [...]
>
> Your explanations are crystal clear.  It's a very nice basis to address the
> following points:
>
> > I really need to rewrite the lr.default-reductions and lr.type
> > documentation.  A cross-reference from %error-verbose would be nice.
> > I'll try to get to that before the 2.5 release.
>
> Thanks for taking the time to analyse this!
Thanks for sharing your results!

> > > Maybe that would also explain my incorrect
> > > messages in the nearby thread about semantic error messages.
> >
> > I skimmed that, and I would guess that you need canonical LR.  I haven't
> > explored your grammar yet though.
>
> I don't think you have the grammar I was referring too, or maybe an old
> version of it.

For some reason, I thought you meant the calc++ example grammar, so that's
what I looked at.

Re: several messages

by Akim Demaille-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Joel!

Le 3 sept. 09 à 07:35, Joel E. Denny a écrit :

> Hi Akim.
>
> On Wed, 2 Sep 2009, Akim Demaille wrote:
>
>> I played with canonical-lr, but it is way too expensive for me,  
>> both in terms
>> of size, and duration of compilation (maybe in a release mode,  
>> but...).
>
> By compilation, you mean gcc?  Or is bison slow?
I meant bison.

>> But
>> toying with lr.default-reductions (still in lalr) gives very good  
>> results.
>>
>> I have 382 LR(0) states, and 12941 is canonical LR(1).
>
> That's quite a jump.  I usually see only one more digit.  There must  
> be
> many similar contexts within your grammar.  What does IELR(1) give  
> you?
> How long does bison take to run for IELR(1) versus LALR(1)?
My grammar has:
- 185 rules
- 104 terminals
- 58 nonterminals

Here are the figures:

> bison=./_build/debug/bison/tests/bison
> src=$HOME/src/kernel
> build=$src/_build/debug
> includes="-I$build/src -I$src/src -I$src/include -I$src/sdk-remote/
> include -I$build/sdk-remote/libport/include -I$src/sdk-remote/
> libport/include -I$build/include -I$src/kernel -I/opt/local/include"
> for lr in lalr ielr canonical-lr
> do
>  for red in all consistent accepting
>  do
>    echo
>    echo $lr-$red
>    TIMEFMT="Bison %E real  %U user  %S system"
>    time $bison --trace=time -Dlr.type=$lr -Dlr.default-reductions=
> $red src/parser/ugrammar.y -o /tmp/ugrammar.cc
>    TIMEFMT="G++   %E real  %U user  %S system"
>    time g++ -DHAVE_CONFIG_H $=includes -c /tmp/ugrammar.cc -o /tmp/
> ugrammar.o
>    $bison -Dlr.type=$lr -Dlr.default-reductions=$red src/parser/
> ugrammar.y -o /tmp/ugrammar.cc -rall
>    sed -n '/^state [0-9]*$/h;${x;p;};' /tmp/ugrammar.output
>    gls -sh /tmp/ugrammar.*
>  done
> done |& tee log.txt


lalr-all

Execution times (seconds)
 LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 outputing parser      :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 running m4            :   0.47 (96%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.49             0.00             0.00
Bison 0.55s real  0.50s user  0.02s system
G++   5.99s real  5.34s user  0.60s system
state 382
164K /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
836K /tmp/ugrammar.o
672K /tmp/ugrammar.output

lalr-consistent

Execution times (seconds)
 LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 parser action tables  :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 outputing parser      :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 running m4            :   0.47 (94%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.50             0.00             0.00
Bison 0.55s real  0.51s user  0.01s system
G++   6.01s real  5.38s user  0.60s system
state 382
256K /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
860K /tmp/ugrammar.o
764K /tmp/ugrammar.output

lalr-accepting

Execution times (seconds)
 LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 parser action tables  :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 outputing parser      :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 running m4            :   0.47 (90%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.52             0.00             0.00
Bison 0.55s real  0.53s user  0.01s system
G++   6.19s real  5.42s user  0.60s system
state 382
312K /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
876K /tmp/ugrammar.o
896K /tmp/ugrammar.output

ielr-all

Execution times (seconds)
 LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 IELR(1) Phase 2       :   0.03 ( 6%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 IELR(1) Phase 3       :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 outputing parser      :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 running m4            :   0.46 (85%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.54             0.00             0.00
Bison 0.56s real  0.55s user  0.01s system
G++   5.98s real  5.34s user  0.59s system
state 382
164K /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
836K /tmp/ugrammar.o
672K /tmp/ugrammar.output

ielr-consistent

Execution times (seconds)
 LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 IELR(1) Phase 2       :   0.03 ( 5%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 IELR(1) Phase 3       :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 parser action tables  :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 outputing parser      :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 running m4            :   0.46 (84%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.55             0.00             0.00
Bison 0.58s real  0.56s user  0.01s system
G++   6.35s real  5.42s user  0.61s system
state 382
256K /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
860K /tmp/ugrammar.o
764K /tmp/ugrammar.output

ielr-accepting

Execution times (seconds)
 LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 IELR(1) Phase 2       :   0.03 ( 5%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 IELR(1) Phase 3       :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 parser action tables  :   0.03 ( 5%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 outputing parser      :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 running m4            :   0.46 (81%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.57             0.00             0.00
Bison 0.60s real  0.59s user  0.01s system
G++   6.12s real  5.41s user  0.60s system
state 382
312K /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
876K /tmp/ugrammar.o
896K /tmp/ugrammar.output

canonical-lr-all

Execution times (seconds)
 IELR(1) Phase 3       :   0.33 (10%) usr   0.01 (10%) sys   0.00 ( 0%) wall
 IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 parser action tables  :   2.19 (65%) usr   0.01 (10%) sys 128.00 (100%) wall
 outputing parser      :   0.16 ( 5%) usr   0.02 (20%) sys   0.00 ( 0%) wall
 running m4            :   0.65 (19%) usr   0.05 (50%) sys   0.00 ( 0%) wall
 freeing               :   0.01 ( 0%) usr   0.01 (10%) sys   0.00 ( 0%) wall
 TOTAL                 :   3.36             0.10           128.00
Bison 3.49s real  3.37s user  0.11s system
G++   7.85s real  6.76s user  0.76s system
state 12941
2.7M /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
1.6M /tmp/ugrammar.o
 25M /tmp/ugrammar.output

canonical-lr-consistent

Execution times (seconds)
 IELR(1) Phase 3       :   0.36 ( 2%) usr   0.01 ( 4%) sys   0.00 ( 0%) wall
 IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 parser action tables  :  16.38 (91%) usr   0.05 (18%) sys   0.00 ( 0%) wall
 outputing parser      :   0.34 ( 2%) usr   0.08 (29%) sys   0.00 ( 0%) wall
 running m4            :   0.89 ( 5%) usr   0.13 (46%) sys   0.00 ( 0%) wall
 freeing               :   0.02 ( 0%) usr   0.01 ( 4%) sys   0.00 ( 0%) wall
 TOTAL                 :  18.01             0.28             0.00
Bison 18.64s real  18.02s user  0.30s system
G++   10.63s real  9.15s user  0.96s system
state 12941
5.6M /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
2.3M /tmp/ugrammar.o
 27M /tmp/ugrammar.output

canonical-lr-accepting

Execution times (seconds)
 reader                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 IELR(1) Phase 3       :   0.37 ( 1%) usr   0.01 ( 2%) sys   0.00 ( 0%) wall
 IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 parser action tables  :  46.08 (95%) usr   0.21 (34%) sys   0.00 ( 0%) wall
 outputing parser      :   0.61 ( 1%) usr   0.12 (20%) sys   0.00 ( 0%) wall
 running m4            :   1.28 ( 3%) usr   0.26 (43%) sys   0.00 ( 0%) wall
 freeing               :   0.01 ( 0%) usr   0.01 ( 2%) sys   0.00 ( 0%) wall
 TOTAL                 :  48.38             0.61             0.00
Bison 49.65s real  48.39s user  0.63s system
G++   13.04s real  11.08s user  1.03s system
state 12941
8.5M /tmp/ugrammar.cc
 52K /tmp/ugrammar.hh
3.2M /tmp/ugrammar.o
 31M /tmp/ugrammar.output



It would be nice to extend the *.output file with various metrics  
about the grammar.

>> Here are the size of the parser files.
>>
>> akim@montero ~/src/gnet/kernel $ ls -l _build/*/ugrammar.cc
>> 12:40:05
>> -rw-r--r--  1 akim  akim   321287 Sep  2 12:10 _build/accepting/
>> ugrammar.cc
>> -rw-r--r--  1 akim  akim   170357 Sep  2 12:13 _build/all/ugrammar.cc
>> -rw-r--r--  1 akim  akim  8913314 Sep  2 11:52 _build/canonical-lr/
>> ugrammar.cc
>> -rw-r--r--  1 akim  akim   263165 Sep  2 12:06 _build/consistent/
>> ugrammar.cc
>
> It's interesting that each value for lr.default-reductions has a
> significant impact on size, but at least it's not orders of  
> magnitude as
> for canonical LR.
Yes, agreed.

>> The syntax error on "1 1" behaves as expected with both accepting and
>> consistent (i.e., mute since there are too many possibilities ;).
>
> If you could unmute it (by adding a few more possibilities in the
> skeleton), it would be interesting to see if canonical LR gives the  
> same
> expected tokens list.

One way out of the fixed arity would be to use "note: " lines, as Alex  
suggested it (and I'm slowly changing my mind on this regard).


>> So I will proceed with default-reductions=all on space-starving  
>> targets (on
>> which anyway I use simple error messages to get rid of the tables),  
>> and
>> "accepting" otherwise.
>
> Why not "consistent"?  For error messages, I don't know of any  
> reason to
> expect either to typically be better than the other.  But "consistent"
> produces the traditional Yacc behavior (actually defined in POSIX)  
> of not
> requesting a token from yylex until necessary.  This can affect  
> semantic
> actions.
I don't see value in this feature in my case.  I guess that this is  
precious in languages like C where the token kind of identifiers  
depends on the context.

> The only reason I made canonical LR choose "accepting" by default is
> because I think of that as being the canonical form.

OK, I'll use consistent, thanks!  It is significantly smaller.


>> The documentation for error-verbose should probably
>> promote the use of accepting/consistent.
>
> They can sometimes worsen the error messages.  That is, in some cases,
> lookahead merging might produce worse effects on the expected list  
> in the
> state containing the default reduction than is produced by  
> performing the
> default reduction.  For example, consider the input "aaz" for this
> grammar:
>
> %token 'z';
> S: 'a' A 'a'
>  | 'b' A B
>  ;
> A: 'a' ;
> B: 'a' | 'b' | 'c' | 'd' ;
Wow.  You do have a collection of interesting grammars at hand :)

> With lr.type=canonical-lr or lr.type=lalr:
>
> syntax error, unexpected 'z', expecting 'a'
>
> But with lr.type=lalr and lr.default-reductions=accepting:
>
> syntax error, unexpected 'z', expecting 'a' or 'b' or 'c' or 'd'
>
> If we add:
>
> %left 'a';
> A: 'a' 'a';
>
> Then lr.default-reductions=consistent has the same problem.
>
> My suspicion would have been that disabling default reductions almost
> always does more good than harm, but I think we need more exploration.
So if we can't trust static tables, should we actually try each  
possible token one after the other, and see if one can be shifted at  
some point?  Of course without touching the stack, that can be quite a  
pain to write :(  But it should not be really different from our error-
recovery code.

Do I also understand that we neither have a superset nor a subset,  
just some fuzzyset?


>>>> Maybe that would also explain my incorrect
>>>> messages in the nearby thread about semantic error messages.
>>>
>>> I skimmed that, and I would guess that you need canonical LR.  I  
>>> haven't
>>> explored your grammar yet though.
>>
>> I don't think you have the grammar I was referring too, or maybe an  
>> old
>> version of it.
>
> For some reason, I thought you meant the calc++ example grammar,
Maybe it features the same phenomenon indeed, I didn't check.

> so that's
> what I looked at.

I can send ugrammar.y privately, if you wish.

Re: several messages

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Akim!

On Mon, 7 Sep 2009, Akim Demaille wrote:

> Le 3 sept. 09 à 07:35, Joel E. Denny a écrit :

> > By compilation, you mean gcc?  Or is bison slow?
>
> I meant bison.

Bummer.  But I have noticed canonical LR can be a little slower to
generate.

> Here are the figures:

Thanks for all that!

> My grammar has:
> - 185 rules
> - 104 terminals
> - 58 nonterminals

How many S/R and R/R conflicts?  I'm also interested in conflicts that are
resolved by prec/assoc declarations.

> lalr-all
>
> Execution times (seconds)
>  LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  outputing parser      :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  running m4            :   0.47 (96%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  TOTAL                 :   0.49             0.00             0.00
> Bison 0.55s real  0.50s user  0.02s system
> G++   5.99s real  5.34s user  0.60s system
> state 382
> 164K /tmp/ugrammar.cc
>  52K /tmp/ugrammar.hh
> 836K /tmp/ugrammar.o
> 672K /tmp/ugrammar.output

> ielr-all
>
> Execution times (seconds)
>  LALR(1)               :   0.01 ( 2%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  IELR(1) Phase 2       :   0.03 ( 6%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  IELR(1) Phase 3       :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  outputing parser      :   0.02 ( 4%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  running m4            :   0.46 (85%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  TOTAL                 :   0.54             0.00             0.00
> Bison 0.56s real  0.55s user  0.01s system
> G++   5.98s real  5.34s user  0.59s system
> state 382
> 164K /tmp/ugrammar.cc
>  52K /tmp/ugrammar.hh
> 836K /tmp/ugrammar.o
> 672K /tmp/ugrammar.output
It's good to see that the IELR computation time is barely worse than LALR.  
Because IELR didn't split any states, the output files are of course
exactly the same.  However, that also means there may not have been much
for IELR to do, and so the timings may not reveal much about IELR's
complexity.  The number of conflicts would provide some insight here...
but maybe I'm getting off topic a little.

(By the way, I've been meaning to add a test case for Bison that makes
sure the LALR and IELR parser tables are exactly the same for
parse-gram.y.  If they ever become different, we'll need to fix
parse-gram.y or start using IELR.)

> canonical-lr-all
>
> Execution times (seconds)
>  IELR(1) Phase 3       :   0.33 (10%) usr   0.01 (10%) sys   0.00 ( 0%) wall
>  IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  parser action tables  :   2.19 (65%) usr   0.01 (10%) sys 128.00 (100%) wall
>  outputing parser      :   0.16 ( 5%) usr   0.02 (20%) sys   0.00 ( 0%) wall
>  running m4            :   0.65 (19%) usr   0.05 (50%) sys   0.00 ( 0%) wall
>  freeing               :   0.01 ( 0%) usr   0.01 (10%) sys   0.00 ( 0%) wall
>  TOTAL                 :   3.36             0.10           128.00
> Bison 3.49s real  3.37s user  0.11s system
> G++   7.85s real  6.76s user  0.76s system
> state 12941
> 2.7M /tmp/ugrammar.cc
>  52K /tmp/ugrammar.hh
> 1.6M /tmp/ugrammar.o
>  25M /tmp/ugrammar.output
I'm glad to see that most of the extra time does not fall in the original
parser table computation.  It just takes a really long time to compress
and transform those huge tables into the output code.

> canonical-lr-consistent
>
> Execution times (seconds)
>  IELR(1) Phase 3       :   0.36 ( 2%) usr   0.01 ( 4%) sys   0.00 ( 0%) wall
>  IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  parser action tables  :  16.38 (91%) usr   0.05 (18%) sys   0.00 ( 0%) wall
>  outputing parser      :   0.34 ( 2%) usr   0.08 (29%) sys   0.00 ( 0%) wall
>  running m4            :   0.89 ( 5%) usr   0.13 (46%) sys   0.00 ( 0%) wall
>  freeing               :   0.02 ( 0%) usr   0.01 ( 4%) sys   0.00 ( 0%) wall
>  TOTAL                 :  18.01             0.28             0.00
> Bison 18.64s real  18.02s user  0.30s system
> G++   10.63s real  9.15s user  0.96s system
> state 12941
> 5.6M /tmp/ugrammar.cc
>  52K /tmp/ugrammar.hh
> 2.3M /tmp/ugrammar.o
>  27M /tmp/ugrammar.output
Wow.  What a jump.  I can see why you don't want this.  I have to admit
that I haven't spent much time exploring the combination of canonical-lr
and different values of lr.default-reductions.  Maybe it's bad that I
defaulted it to "accepting"?  I just didn't want anyone to argue that they
specified canonical LR but didn't get it.

> canonical-lr-accepting
>
> Execution times (seconds)
>  reader                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  IELR(1) Phase 3       :   0.37 ( 1%) usr   0.01 ( 2%) sys   0.00 ( 0%) wall
>  IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
>  parser action tables  :  46.08 (95%) usr   0.21 (34%) sys   0.00 ( 0%) wall
>  outputing parser      :   0.61 ( 1%) usr   0.12 (20%) sys   0.00 ( 0%) wall
>  running m4            :   1.28 ( 3%) usr   0.26 (43%) sys   0.00 ( 0%) wall
>  freeing               :   0.01 ( 0%) usr   0.01 ( 2%) sys   0.00 ( 0%) wall
>  TOTAL                 :  48.38             0.61             0.00
> Bison 49.65s real  48.39s user  0.63s system
> G++   13.04s real  11.08s user  1.03s system
> state 12941
> 8.5M /tmp/ugrammar.cc
>  52K /tmp/ugrammar.hh
> 3.2M /tmp/ugrammar.o
>  31M /tmp/ugrammar.output
Ouch.  Well, more motivation to steer clear of canonical LR.

> It would be nice to extend the *.output file with various metrics  
> about the grammar.

Agreed.

> >> The syntax error on "1 1" behaves as expected with both accepting and
> >> consistent (i.e., mute since there are too many possibilities ;).
> >
> > If you could unmute it (by adding a few more possibilities in the
> > skeleton), it would be interesting to see if canonical LR gives the  
> > same
> > expected tokens list.

I was interested in extending it temporarily to compare the expected
tokens for your specific grammar.  Of course, I also would like to see it
fixed in general....

> One way out of the fixed arity would be to use "note: " lines, as Alex  
> suggested it (and I'm slowly changing my mind on this regard).

I was thinking we were both only opposed to dropping the warning label.  
I like the idea of a special function for formatting submessages.  I'm not
so sure about the "note: " prefix though.

However, for the generated parser rather than for bison, strings fed to
yyerror currently don't contain newlines.  I'm not sure how important or
how long-standing that convention has been.  If we move to a %error-report
or something like that, that might be a good opportunity to break the
convention... or to let the user take control.

> >> So I will proceed with default-reductions=all on space-starving  
> >> targets (on
> >> which anyway I use simple error messages to get rid of the tables),  
> >> and
> >> "accepting" otherwise.
> >
> > Why not "consistent"?  For error messages, I don't know of any  
> > reason to
> > expect either to typically be better than the other.  But "consistent"
> > produces the traditional Yacc behavior (actually defined in POSIX)  
> > of not
> > requesting a token from yylex until necessary.  This can affect  
> > semantic
> > actions.
>
> I don't see value in this feature in my case.  I guess that this is  
> precious in languages like C where the token kind of identifiers  
> depends on the context.
I can understand that.  It doesn't matter for all parsers.

> Wow.  You do have a collection of interesting grammars at hand :)

Well, I came up with those toy grammars for that email.  I didn't mean to
imply that they were in any way practical or typical.  They just prove the
existence of a flaw, and that might later prove to affect the real world.
But maybe I'm reading too much into what you're saying.  :)

> > My suspicion would have been that disabling default reductions almost
> > always does more good than harm, but I think we need more exploration.
>
> So if we can't trust static tables, should we actually try each  
> possible token one after the other, and see if one can be shifted at  
> some point?  Of course without touching the stack, that can be quite a  
> pain to write :(  But it should not be really different from our error-
> recovery code.

You're reading my mind.  :)  I think we've realized for a while that
%error-verbose is not perfect.  However, soon after our current discussion
started, the extent of the problems began to sink in for me, and I
realized the implications for my own work, so I started working on a
solution very similar to what you describe.  I'm nearly done with the
coding.  I might be able to push a candidate soon.

However, there's one problem with your suggestion.  By the time the syntax
error is reported, it's too late.  The parser may have already performed
some default reductions (or otherwise invalid reductions due to state
merging) and thus lost the original context recorded on the stack.  The
stack popping that error recovery does is not necessarily sufficient to
restore the original stack because reductions may have removed parts of
the stack permanently.

Thus, immediately upon fetching the token (it's fine if the fetch comes
after performing some default reductions in consistent states), the parser
needs to determine whether the token can eventually be shifted.  That is,
the parser first tries the parse without performing semantic actions,
without manipulating the value and location stacks, and without losing the
original state stack.  If it successfully reaches a shift, it then repeats
the parse on all the normal stacks.  If unsuccessful instead, it
immediately reports a syntax error and then repeats the exploration for
every token, as you suggested, so it can report what tokens were expected.  
During each such exploration (during parsing or at error recovery), the
parser need not duplicate the normal parser state stack in order to avoid
losing the original context.  Instead, it keeps a temporary stack that
records only the exploratory states that are not already recorded on the
normal parser stack.  It also keeps a pointer into the normal parser stack
in order to record the base of the temporary stack.  That pointer may move
during exploratory reductions.  I've implemented all this already in
yacc.c, and it functions correctly so far.  I'm just tidying up memory
management at the moment, and I need to do more testing.

I've been calling this lookahead correction technique "LAC" (LookAhead
Correction) for lack of anything better.  Maybe LR LAC?  The %define
variable is parse.lac for now, but I realize that's yet another
abbreviation.

At worst, LAC should no more than double the parsing time.  That is, even
though it's similar to GLR or backtracking, it tries only one possible
parse on any token up to its shift before performing the parse for real or
reporting a syntax error.  (Also unlike GLR and backtracking, we still
resolve conflicts statically.  The parser is still deterministic.)  My
suspicion is that, given the time for file I/O, lexical analysis, and
semantic actions, the total run time of the parser will not be affected
severely by this doubling.  Of course, building a syntax error message
will take longer than double, but I'm not as worried about that.  When I
finish implementing, I'll try collecting some statistics.

Another interesting point is that, with LAC, IELR should always perform
exactly the same semantic actions as canonical LR even for syntactically
incorrect sentences.  Previously, IELR could only claim that for
syntactically correct sentences.  (The same can be said for LALR as long
as there are no mysterious conflicts, but, unless there are no conflicts
at all, that's hard to verify without an IELR implementation anyway.)  
One can even imagine using LAC in conjunction with a push parser to
implement, for example, tab completions.

Keep in mind that, with LAC, the only possible difference between
lr.default-reductions=all or =consistent for any fixed value of lr.type
should be parser tables sizes and performance (that is, possibly many more
reductions to explore before permitting normal parsing), which would be
interesting to collect statistics on.  But there should be no change in
basic parsing behavior or error messages.  For canonical LR only, LAC is
redundant unless lr.default-reductions=all.  As always,
lr.default-reductions=accepting can alter how semantic actions are
performed because the use of default reductions in consistent states
affects when a token is fetched and thus when LAC starts for that token.

The first approach I thought of for correcting lookaheads was a little
different than what I describe above.  Upon fetching each token, the
parser would proceed parsing normally but would begin to push all states
popped from the normal parser stack onto a temporary stack (popped
semantic values and locations would not be stored) and would again keep a
pointer to its base in the normal parser stack.  Upon shifting the token,
the temporary stack would be cleared.  Upon encountering an error instead,
the temporary stack would enable the parser to remember what the original
state stack looked like upon fetching the token, and yysyntax_error could
then explore the parse for all tokens as in the previous approach to see
which tokens were expected.  The main loss with this approach would be
that canonical LR semantic actions would no longer be perfectly emulated
as in the previous approach.  The implementation would also be more
complicated.  Originally, I thought the gain might be a lower performance
impact before a syntax error is detected, but now I'm not so sure this
approach is much different than doubling the parser actions as in the
previous approach.  The main difference seems to be that, in this
approach, the parser tables need not be decoded a second time in order to
manipulate the temporary stack.  Well, if the previous approach has poor
performance, I might try this approach just in case it's somehow better.

BTW, I've not yet done any homework to see if any of this has been done
before.  Now that I write it down, it seems like such a simple premise, so
maybe it has been done many times.  I should probably have researched it
before banging out an implementation.  Maybe there's a better way already
discovered.

> Do I also understand that we neither have a superset nor a subset,  
> just some fuzzyset?

For expected tokens reported in syntax error messages, that's correct.  
It's unsettling, isn't it?

> > so that's
> > what I looked at.
>
> I can send ugrammar.y privately, if you wish.

I'd like to see that.  Thanks!

Re: several messages

by Akim Demaille-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Le 8 sept. 2009 à 01:42, Joel E. Denny a écrit :

>> My grammar has:
>> - 185 rules
>> - 104 terminals
>> - 58 nonterminals
>
> How many S/R and R/R conflicts?  I'm also interested in conflicts  
> that are
> resolved by prec/assoc declarations.

There are no remaining conflicts.  The updated log has the figures on  
solved conflicts:

- 1143 in lalr/ielr
- 64650 in full lr

> It's good to see that the IELR computation time is barely worse than  
> LALR.
> Because IELR didn't split any states, the output files are of course
> exactly the same.  However, that also means there may not have been  
> much
> for IELR to do, and so the timings may not reveal much about IELR's
> complexity.  The number of conflicts would provide some insight  
> here...
> but maybe I'm getting off topic a little.
>
> (By the way, I've been meaning to add a test case for Bison that makes
> sure the LALR and IELR parser tables are exactly the same for
> parse-gram.y.  If they ever become different, we'll need to fix
> parse-gram.y or start using IELR.)

Good point :)


>> canonical-lr-all
>>
>> Execution times (seconds)
>> IELR(1) Phase 3       :   0.33 (10%) usr   0.01 (10%) sys   0.00  
>> ( 0%) wall
>> IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00  
>> ( 0%) wall
>> conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00  
>> ( 0%) wall
>> parser action tables  :   2.19 (65%) usr   0.01 (10%) sys 128.00  
>> (100%) wall
>> outputing parser      :   0.16 ( 5%) usr   0.02 (20%) sys   0.00  
>> ( 0%) wall
>> running m4            :   0.65 (19%) usr   0.05 (50%) sys   0.00  
>> ( 0%) wall
>> freeing               :   0.01 ( 0%) usr   0.01 (10%) sys   0.00  
>> ( 0%) wall
>> TOTAL                 :   3.36             0.10           128.00
>> Bison 3.49s real  3.37s user  0.11s system
>> G++   7.85s real  6.76s user  0.76s system
>> state 12941
>> 2.7M /tmp/ugrammar.cc
>> 52K /tmp/ugrammar.hh
>> 1.6M /tmp/ugrammar.o
>> 25M /tmp/ugrammar.output
>
> I'm glad to see that most of the extra time does not fall in the  
> original
> parser table computation.  It just takes a really long time to  
> compress
> and transform those huge tables into the output code.

Yes, indeed.

>> canonical-lr-consistent
>>
>> Execution times (seconds)
>> IELR(1) Phase 3       :   0.36 ( 2%) usr   0.01 ( 4%) sys   0.00  
>> ( 0%) wall
>> IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00  
>> ( 0%) wall
>> conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00  
>> ( 0%) wall
>> parser action tables  :  16.38 (91%) usr   0.05 (18%) sys   0.00  
>> ( 0%) wall
>> outputing parser      :   0.34 ( 2%) usr   0.08 (29%) sys   0.00  
>> ( 0%) wall
>> running m4            :   0.89 ( 5%) usr   0.13 (46%) sys   0.00  
>> ( 0%) wall
>> freeing               :   0.02 ( 0%) usr   0.01 ( 4%) sys   0.00  
>> ( 0%) wall
>> TOTAL                 :  18.01             0.28             0.00
>> Bison 18.64s real  18.02s user  0.30s system
>> G++   10.63s real  9.15s user  0.96s system
>> state 12941
>> 5.6M /tmp/ugrammar.cc
>> 52K /tmp/ugrammar.hh
>> 2.3M /tmp/ugrammar.o
>> 27M /tmp/ugrammar.output
>
> Wow.  What a jump.  I can see why you don't want this.  I have to  
> admit
> that I haven't spent much time exploring the combination of  
> canonical-lr
> and different values of lr.default-reductions.  Maybe it's bad that I
> defaulted it to "accepting"?  I just didn't want anyone to argue  
> that they
> specified canonical LR but didn't get it.

I have no strong opinion about this.  After all, someone reading the  
doc about lr.type should be ready to read the section about lr.default-
reductions.  We have to warn them.  My figures are public, if you mean  
to use them in some form in the documentation, that would be fine with  
me.


>> canonical-lr-accepting
>>
>> Execution times (seconds)
>> reader                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00  
>> ( 0%) wall
>> IELR(1) Phase 3       :   0.37 ( 1%) usr   0.01 ( 2%) sys   0.00  
>> ( 0%) wall
>> IELR(1) Phase 4       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00  
>> ( 0%) wall
>> conflicts             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00  
>> ( 0%) wall
>> parser action tables  :  46.08 (95%) usr   0.21 (34%) sys   0.00  
>> ( 0%) wall
>> outputing parser      :   0.61 ( 1%) usr   0.12 (20%) sys   0.00  
>> ( 0%) wall
>> running m4            :   1.28 ( 3%) usr   0.26 (43%) sys   0.00  
>> ( 0%) wall
>> freeing               :   0.01 ( 0%) usr   0.01 ( 2%) sys   0.00  
>> ( 0%) wall
>> TOTAL                 :  48.38             0.61             0.00
>> Bison 49.65s real  48.39s user  0.63s system
>> G++   13.04s real  11.08s user  1.03s system
>> state 12941
>> 8.5M /tmp/ugrammar.cc
>> 52K /tmp/ugrammar.hh
>> 3.2M /tmp/ugrammar.o
>> 31M /tmp/ugrammar.output
>
> Ouch.  Well, more motivation to steer clear of canonical LR.

:)


>> It would be nice to extend the *.output file with various metrics
>> about the grammar.
>
> Agreed.

Will add this to TODO.

>>>> The syntax error on "1 1" behaves as expected with both accepting  
>>>> and
>>>> consistent (i.e., mute since there are too many possibilities ;).
>>>
>>> If you could unmute it (by adding a few more possibilities in the
>>> skeleton), it would be interesting to see if canonical LR gives the
>>> same
>>> expected tokens list.
>
> I was interested in extending it temporarily to compare the expected
> tokens for your specific grammar.  Of course, I also would like to  
> see it
> fixed in general....

I'll keep this as another TODO item :)


>> One way out of the fixed arity would be to use "note: " lines, as  
>> Alex
>> suggested it (and I'm slowly changing my mind on this regard).
>
> I was thinking we were both only opposed to dropping the warning  
> label.
> I like the idea of a special function for formatting submessages.  
> I'm not
> so sure about the "note: " prefix though.

I changed my mind when I saw how current GCCs talk to me.

$ cat /tmp/foo.cc
void foo (float*) {}
void foo (int*){}

int
main()
{
   foo(1);
}
$ g++ -Wall /tmp/foo.cc
/tmp/foo.cc: In function 'int main()':
/tmp/foo.cc:7: error: call of overloaded 'foo(int)' is ambiguous
/tmp/foo.cc:1: note: candidates are: void foo(float*) <near match>
/tmp/foo.cc:2: note:                 void foo(int*) <near match>

and Emacs correctly understands this as a single error.


> However, for the generated parser rather than for bison, strings fed  
> to
> yyerror currently don't contain newlines.  I'm not sure how  
> important or
> how long-standing that convention has been.  If we move to a %error-
> report
> or something like that, that might be a good opportunity to break the
> convention... or to let the user take control.

I think we should still keep separated the action of building the  
error message, and "sending" the error message.  But indeed, maybe  
it's wrong.  It's clearly "nicer", more modular, but on the other hand  
it may push on the user useless memory-allocation issues that do not  
even exist if she uses directly fprintf in her %error-report.

Still...  I feel uncomfortable with the fusion of both.



>>> My suspicion would have been that disabling default reductions  
>>> almost
>>> always does more good than harm, but I think we need more  
>>> exploration.
>>
>> So if we can't trust static tables, should we actually try each
>> possible token one after the other, and see if one can be shifted at
>> some point?  Of course without touching the stack, that can be  
>> quite a
>> pain to write :(  But it should not be really different from our  
>> error-
>> recovery code.
>
> You're reading my mind.  :)  I think we've realized for a while that
> %error-verbose is not perfect.

Yes, I agree.

> However, there's one problem with your suggestion.  By the time the  
> syntax
> error is reported, it's too late.  The parser may have already  
> performed
> some default reductions

Ah!  Of course!

> Thus, immediately upon fetching the token (it's fine if the fetch  
> comes
> after performing some default reductions in consistent states), the  
> parser
> needs to determine whether the token can eventually be shifted.  
> That is,
> the parser first tries the parse without performing semantic actions,
> without manipulating the value and location stacks, and without  
> losing the
> original state stack.  If it successfully reaches a shift, it then  
> repeats
> the parse on all the normal stacks.  If unsuccessful instead, it
> immediately reports a syntax error and then repeats the exploration  
> for
> every token, as you suggested, so it can report what tokens were  
> expected.
> During each such exploration (during parsing or at error recovery),  
> the
> parser need not duplicate the normal parser state stack in order to  
> avoid
> losing the original context.  Instead, it keeps a temporary stack that
> records only the exploratory states that are not already recorded on  
> the
> normal parser stack.  It also keeps a pointer into the normal parser  
> stack
> in order to record the base of the temporary stack.  That pointer  
> may move
> during exploratory reductions.  I've implemented all this already in
> yacc.c, and it functions correctly so far.  I'm just tidying up memory
> management at the moment, and I need to do more testing.

That sounds exciting!


> At worst, LAC should no more than double the parsing time.  That is,  
> even
> though it's similar to GLR or backtracking, it tries only one possible
> parse on any token up to its shift before performing the parse for  
> real or
> reporting a syntax error.  (Also unlike GLR and backtracking, we still
> resolve conflicts statically.  The parser is still deterministic.)  My
> suspicion is that, given the time for file I/O, lexical analysis, and
> semantic actions, the total run time of the parser will not be  
> affected
> severely by this doubling.  Of course, building a syntax error message
> will take longer than double, but I'm not as worried about that.  
> When I
> finish implementing, I'll try collecting some statistics.

Now I see what you're using bench.pl for :)


> Another interesting point is that, with LAC, IELR should always  
> perform
> exactly the same semantic actions as canonical LR even for  
> syntactically
> incorrect sentences.

Nice feature!

>  Previously, IELR could only claim that for
> syntactically correct sentences.  (The same can be said for LALR as  
> long
> as there are no mysterious conflicts, but, unless there are no  
> conflicts
> at all, that's hard to verify without an IELR implementation anyway.)
> One can even imagine using LAC in conjunction with a push parser to
> implement, for example, tab completions.

I also had this in mind.


> Keep in mind that, with LAC, the only possible difference between
> lr.default-reductions=all or =consistent for any fixed value of  
> lr.type
> should be parser tables sizes and performance (that is, possibly  
> many more
> reductions to explore before permitting normal parsing), which would  
> be
> interesting to collect statistics on.

It would also be nice to explore the merits at runtime of the table  
compressions.  I remember that some other tools (maybe Yacc++) offers  
this feature.  Space is probably no longer an issue, so maybe there  
would be some added runtime performances to use raw tables.


> BTW, I've not yet done any homework to see if any of this has been  
> done
> before.  Now that I write it down, it seems like such a simple  
> premise, so
> maybe it has been done many times.  I should probably have  
> researched it
> before banging out an implementation.  Maybe there's a better way  
> already
> discovered.

It seems that indeed, many many results were obtained about LR  
parsers, but many of them are now somewhat forgotten.  Yet, there are  
so many papers out there that I don't know what's the right means to  
fetch only the relevant ones.

>> Do I also understand that we neither have a superset nor a subset,
>> just some fuzzyset?
>
> For expected tokens reported in syntax error messages, that's correct.
> It's unsettling, isn't it?

Yes :)

>> I can send ugrammar.y privately, if you wish.
>
> I'd like to see that.  Thanks!

Will do right afterwards.  Please, no dirty jokes about my grammar :)  
I wish it was nicer in many places...


Re: several messages

by twlevo :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, September 16, 2009 16:32, Akim Demaille wrote:

>>> It would be nice to extend the *.output file with various metrics
>>> about the grammar.
>>
>> Agreed.
>
> Will add this to TODO.
About this metrics feature: there is a patched bison version
with a metrics option added. it is at "Ambiguity detection in GNU Bison"
http://www.lsv.ens-cachan.fr/~schmitz/software
there are some test grammars and documentation and cannot
test this feature because the bison-2.3a+ambiguity.tar.bz2
does not compile for some reason.
the source for metrics is in metrics.c


[metrics.c]

/*
   Copyright (C) 2006
   Laboratoire I3S, Université de Nice-Sophia Antipolis & CNRS.

   This file is part of Bison, the GNU Compiler Compiler.

   Bison is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2, or (at your option)
   any later version.

   Bison is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with Bison; see the file COPYING.  If not, write to
   the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
   Boston, MA 02110-1301, USA.  */
/**
 * \file metrics.c
 * \brief Grammar metrics computations.
 */
#include <config.h>
#include "system.h"
#include <math.h>
#include "gram.h"
#include "derives.h"
#include "metrics.h"
#include "sets.h"

/** Returns the McCabe complexity of the grammar.  In a bison grammar,
    this boils down to counting the number of alternance signs \c
    |.
    \pre #derives_counts was computed. */
inline static unsigned int
mcc (void)
{
  unsigned int i, ret = 0;

  for (i = 0; i < nvars; i++)
    ret += (derives_counts[i])? derives_counts[i] - 1: 0;

  return ret;
}

/** Compute the number of LR(1) positions.
    \pre #derives has been computed.
    \pre #follow_tokens has been computed. */
inline static unsigned int
lr1_npositions (void)
{
  unsigned int i, j, ret = 0;

  for (i = 0; i < nvars; i++)
    {
      unsigned int f = bitset_count (follow_tokens[i]);
      ret += 2 * f;
      for (j = 0; derives[i][j]; j++)
        ret += f * (rule_rhs_length (derives[i][j]) + 1);
    }

  return ret;
}

inline void
grammar_metrics (FILE *out)
{
  unsigned int nlr1 = lr1_npositions ();
  double npairs = (double)nlr1 * (double)(nlr1 + 1) * 2;
  fprintf (out, "Grammar metrics.\n");
  fprintf (out, "\tsize |G|:\t\t%u\n", nritems);
  fprintf (out, "\tterminals |T|:\t\t%u\n", ntokens);
  fprintf (out, "\tnonterminals |N|:\t%u\n", nvars);
  fprintf (out, "\tvocabulary |V|:\t\t%u\n", nsyms);
  fprintf (out, "\tproductions |P|:\t%u\n", nrules);
  fprintf (out, "\tnorm ||G||:\t\t%.2f\n", nritems * log (nsyms) / M_LN2);
  fprintf (out, "\tMcCabe complexity MCC:\t%u\n", mcc ());
  fprintf (out, "\taverage rhs length AVS:\t%.2f\n",
           (double)nritems / (double)nrules);
  fprintf (out, "\tLR(1) positions:\t%u\n", nlr1);
  fprintf (out, "\tmax LR(1) pairs:\t%.3e\n", npairs);
  fprintf (out, "\n");
}


Re: several messages

by Akim Demaille :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Le 16 sept. 2009 à 19:06, tys lefering a écrit :

> On Wed, September 16, 2009 16:32, Akim Demaille wrote:
>
>>>> It would be nice to extend the *.output file with various metrics
>>>> about the grammar.
>>>
>>> Agreed.
>>
>> Will add this to TODO.
> About this metrics feature: there is a patched bison version
> with a metrics option added. it is at "Ambiguity detection in GNU  
> Bison"
> http://www.lsv.ens-cachan.fr/~schmitz/software
> there are some test grammars and documentation and cannot
> test this feature because the bison-2.3a+ambiguity.tar.bz2
> does not compile for some reason.
> the source for metrics is in metrics.c
>
> <metrics.c>

Wow...  Indeed, that's very much like it, and much more!  We should  
definitely study the opportunity to include these changes (heuristics  
for ambiguity detection) in the main stream Bison.

Maybe 2.7?

We have really many branches going on currently.  I, at least, need to  
focus on 2.6.


Re: several messages

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Akim.

On Wed, 16 Sep 2009, Akim Demaille wrote:

> > How many S/R and R/R conflicts?  I'm also interested in conflicts that are
> > resolved by prec/assoc declarations.
>
> There are no remaining conflicts.  The updated log has the figures on solved
> conflicts:

What do you mean by the "updated log"?  Did I miss an attachment
somewhere?

> - 1143 in lalr/ielr
> - 64650 in full lr

Thanks.  This data demonstrates another motivation not to use canonical
LR: when you're debugging conflicts using the parser tables, you don't
want to deal with an order of magnitude more states or an order of
magnitude more conflicts.  This motivation is timeless in that computer
hardware improvements won't alleviate it.

> > It's good to see that the IELR computation time is barely worse than LALR.
> > Because IELR didn't split any states, the output files are of course
> > exactly the same.  However, that also means there may not have been much
> > for IELR to do, and so the timings may not reveal much about IELR's
> > complexity.  The number of conflicts would provide some insight here...
> > but maybe I'm getting off topic a little.

Using the grammar you sent me, I ran a script of mine to analyze the
canonical LR versus LALR tables, and it confirms that no parser actions
have changed for any viable prefix.  I say this just to give more
confidence that IELR is working correctly when it concludes that LALR is
sufficient for your grammar.

Also, all those conflicts indicate that IELR likely had much work to do
even though it ran nearly as quickly as just LALR.  I was able to actually
see some of the work it did by using '-Dlr.type=ielr --trace=ielr'.  The
summary at the end of the trace is the best indicator.  The rest of the
trace is a bit cryptic, I'm afraid.

> > (By the way, I've been meaning to add a test case for Bison that makes
> > sure the LALR and IELR parser tables are exactly the same for
> > parse-gram.y.  If they ever become different, we'll need to fix
> > parse-gram.y or start using IELR.)
>
> Good point :)

I pushed the patch below to master and branch-2.5.

> > Wow.  What a jump.  I can see why you don't want this.  I have to admit
> > that I haven't spent much time exploring the combination of canonical-lr
> > and different values of lr.default-reductions.  Maybe it's bad that I
> > defaulted it to "accepting"?  I just didn't want anyone to argue that they
> > specified canonical LR but didn't get it.
>
> I have no strong opinion about this.  After all, someone reading the doc about
> lr.type should be ready to read the section about lr.default-reductions.  We
> have to warn them.

That makes sense.  I still need to reorganize that documentation though.

> My figures are public, if you mean to use them in some
> form in the documentation, that would be fine with me.

Thanks.  Your figures are definitely interesting as a very bad case for
canonical LR.

> I changed my mind when I saw how current GCCs talk to me.
>
> $ cat /tmp/foo.cc
> void foo (float*) {}
> void foo (int*){}
>
> int
> main()
> {
>  foo(1);
> }
> $ g++ -Wall /tmp/foo.cc
> /tmp/foo.cc: In function 'int main()':
> /tmp/foo.cc:7: error: call of overloaded 'foo(int)' is ambiguous
> /tmp/foo.cc:1: note: candidates are: void foo(float*) <near match>
> /tmp/foo.cc:2: note:                 void foo(int*) <near match>
>
> and Emacs correctly understands this as a single error.

Ah, C++.  Well, that's a strong precedent.  I had only checked the C
compiler previously, so I missed it.

> > BTW, I've not yet done any homework to see if any of this has been done
> > before.  Now that I write it down, it seems like such a simple premise, so
> > maybe it has been done many times.  I should probably have researched it
> > before banging out an implementation.  Maybe there's a better way already
> > discovered.
>
> It seems that indeed, many many results were obtained about LR parsers, but
> many of them are now somewhat forgotten.  Yet, there are so many papers out
> there that I don't know what's the right means to fetch only the relevant
> ones.

I did do a quick search and saw that this issue has been discussed in
various forms for Bison.  I still would like to find some published papers
or complete implementations.  It's like looking for a needle in a
haystack.  Maybe I'll write comp.compilers for advice.

> > > I can send ugrammar.y privately, if you wish.
> >
> > I'd like to see that.  Thanks!
>
> Will do right afterwards.

Thanks for sending that!

> Please, no dirty jokes about my grammar :)  I wish
> it was nicer in many places...

I haven't spent much time reading the .y file itself yet.  I've just run
automated traces and analyses.  I'll let you know if I come up with any
jokes later.  :)

>From 43aabb70a95ecbd20c76797c53554641c3576db4 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <jdenny@...>
Date: Sat, 26 Sep 2009 14:49:20 -0400
Subject: [PATCH] tests: check that parse-gram.y's IELR and LALR are identical.

* tests/atlocal.in (abs_top_srcdir): New shell variable.
* tests/regression.at (parse-gram.y: LALR = IELR): New test
group.
---
 ChangeLog           |    7 +++++++
 tests/atlocal.in    |    2 ++
 tests/regression.at |   21 +++++++++++++++++++++
 3 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index ca92292..f24be8a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2009-09-26  Joel E. Denny  <jdenny@...>
+
+ tests: check that parse-gram.y's IELR and LALR are identical.
+ * tests/atlocal.in (abs_top_srcdir): New shell variable.
+ * tests/regression.at (parse-gram.y: LALR = IELR): New test
+ group.
+
 2009-09-19  Alex Rozenman  <rozenman@...>
 
  Keep sub-messages aligned. Fix strings for translation.
diff --git a/tests/atlocal.in b/tests/atlocal.in
index 91ba674..2e46329 100644
--- a/tests/atlocal.in
+++ b/tests/atlocal.in
@@ -42,3 +42,5 @@ CONF_JAVA='@CONF_JAVA@'
 
 # We need egrep.
 : ${EGREP='@EGREP@'}
+
+abs_top_srcdir='@abs_top_srcdir@'
diff --git a/tests/regression.at b/tests/regression.at
index 6bfc77e..2482189 100644
--- a/tests/regression.at
+++ b/tests/regression.at
@@ -1237,3 +1237,24 @@ AT_COMPILE([[input]])
 AT_PARSER_CHECK([[./input]])
 
 AT_CLEANUP
+
+
+
+## --------------------------- ##
+## parse-gram.y: LALR = IELR.  ##
+## --------------------------- ##
+
+# If parse-gram.y's LALR and IELR parser tables ever begin to differ, we
+# need to fix parse-gram.y or start using IELR.
+
+AT_SETUP([[parse-gram.y: LALR = IELR]])
+
+# Avoid tests/bison's dark magic by processing a local copy of the
+# grammar.  Avoid differences in synclines by telling bison that the
+# output files have the same name.
+cp $abs_top_srcdir/src/parse-gram.y input.y
+AT_BISON_CHECK([[-o input.c -Dlr.type=lalr input.y && mv input.c lalr.c]])
+AT_BISON_CHECK([[-o input.c -Dlr.type=ielr input.y && mv input.c ielr.c]])
+AT_CHECK([[diff -u lalr.c ielr.c]])
+
+AT_CLEANUP
--
1.5.4.3




Re: several messages

by Joel E. Denny-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sat, 26 Sep 2009, Joel E. Denny wrote:

> Subject: [PATCH] tests: check that parse-gram.y's IELR and LALR are identical.
>
> * tests/atlocal.in (abs_top_srcdir): New shell variable.
> * tests/regression.at (parse-gram.y: LALR = IELR): New test
> group.

To clean that up, I pushed the following patch to master and branch-2.5.

>From d8f68fc29536fa1c1e7b1b200b1e8088762c3e93 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <jdenny@...>
Date: Sun, 27 Sep 2009 14:37:00 -0400
Subject: [PATCH] tests: don't abuse AT_BISON_CHECK.

* tests/regression.at (parse-gram.y: LALR = IELR): Move
additional shell commands outside of AT_BISON_CHECK.
---
 ChangeLog           |    6 ++++++
 tests/regression.at |    8 +++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f24be8a..a0b4314 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2009-09-27  Joel E. Denny  <jdenny@...>
+
+ tests: don't abuse AT_BISON_CHECK.
+ * tests/regression.at (parse-gram.y: LALR = IELR): Move
+ additional shell commands outside of AT_BISON_CHECK.
+
 2009-09-26  Joel E. Denny  <jdenny@...>
 
  tests: check that parse-gram.y's IELR and LALR are identical.
diff --git a/tests/regression.at b/tests/regression.at
index 2482189..cfc071e 100644
--- a/tests/regression.at
+++ b/tests/regression.at
@@ -1252,9 +1252,11 @@ AT_SETUP([[parse-gram.y: LALR = IELR]])
 # Avoid tests/bison's dark magic by processing a local copy of the
 # grammar.  Avoid differences in synclines by telling bison that the
 # output files have the same name.
-cp $abs_top_srcdir/src/parse-gram.y input.y
-AT_BISON_CHECK([[-o input.c -Dlr.type=lalr input.y && mv input.c lalr.c]])
-AT_BISON_CHECK([[-o input.c -Dlr.type=ielr input.y && mv input.c ielr.c]])
+[cp $abs_top_srcdir/src/parse-gram.y input.y]
+AT_BISON_CHECK([[-o input.c -Dlr.type=lalr input.y]])
+[mv input.c lalr.c]
+AT_BISON_CHECK([[-o input.c -Dlr.type=ielr input.y]])
+[mv input.c ielr.c]
 AT_CHECK([[diff -u lalr.c ielr.c]])
 
 AT_CLEANUP
--
1.5.4.3