Recent /bin/sh breakage

Recent /bin/sh breakage

by Andreas Gustafsson

Hi all,

I'm running daily automated tests of NetBSD-current which consist of
building an i386 release and then installing and booting it in a
virtual machine using Anita (misc/py-anita in pkgsrc).

As of CVS date 2009.10.30.15.09.24, the system successfully installs,
but the installed system doesn't boot:

   root on md0a dumps on md0b
   root file system type: ffs
   warning: no /dev/console
   UVM: pid 2 (sh), uid 0 killed: out of swap
   panic: init died (signal 0, exit 11)
   fatal breakpoint trap in supervisor mode
   trap type 1 code 0 eip c019ecb4 cs 8 eflags 246 cr2 8163110 ilevel 0
   Stopped in pid 1.1 (init) at    0xc019ecb4:     popl    %ebp
   db{0}>

This breakage occurred some time between CVS dates 2009.10.29.14.49.40
and 2009.10.30.15.09.24, but narrowing it down closer than that is
difficult because in most of the intervening period, -current didn't
even build.
--
Andreas Gustafsson, gson@...

Re: Recent /bin/sh breakage

by Andreas Gustafsson

A couple of weeks ago, I wrote:
> I'm running daily automated tests of NetBSD-current which consist of
> building an i386 release and then installing and booting it in a
> virtual machine using Anita (misc/py-anita in pkgsrc).
>
> As of CVS date 2009.10.30.15.09.24, the system successfully installs,
> but the installed system doesn't boot

Sorry, I misread the log file - it actually crashes shortly after
booting the INSTALL kernel, not while booting the installed system.

>    root on md0a dumps on md0b
>    root file system type: ffs
>    warning: no /dev/console
>    UVM: pid 2 (sh), uid 0 killed: out of swap

That was using "qemu -m 32", emulating 32 MB of memory.  If I increase
the amount of memory to 40 MB, the installation gets a bit further,
and with 50 MB, the installation is successful.

A build from 2009.10.29.14.49.40 sources installs with just 24 MB of
memory, so it looks like something gained a whole lot of bloat during
those 24-odd hours (not necessarily /bin/sh itself, though).

It's still broken as of 2009.11.10.18.19.46 (with 32 MB).
--
Andreas Gustafsson, gson@...

Re: Recent /bin/sh breakage

by Martin Husemann

On Wed, Nov 11, 2009 at 10:25:34PM +0200, Andreas Gustafsson wrote:
> A build from 2009.10.29.14.49.40 sources installs with just 24 MB of
> memory, so it looks like something gained a whole lot of bloat during
> those 24-odd hours (not necessarily /bin/sh itself, though).

uarea swap support got removed, maybe that makes this difference?

Martin

Re: Recent /bin/sh breakage

by Andreas Gustafsson

Martin Husemann wrote:
> uarea swap support got removed, maybe that makes this difference?

I don't see any uarea-related commits during the time period in
question.  This should be the full list of the commits between the
last working version and the first broken one:

  2009.10.29.17.10.32 christos src/sys/ufs/lfs/lfs.h,v 1.129
  2009.10.29.17.10.32 christos src/sys/ufs/lfs/lfs_vnops.c,v 1.222
  2009.10.29.17.16.40 christos src/tools/lex/Makefile,v 1.8
  2009.10.29.17.17.12 christos src/external/bsd/flex/dist/Attic/parse.c,v 1.3
  2009.10.29.17.17.12 christos src/external/bsd/flex/dist/Attic/parse.h,v 1.2
  2009.10.29.17.17.12 christos src/external/bsd/flex/dist/initparse.c,v 1.1
  2009.10.29.17.17.12 christos src/external/bsd/flex/dist/initparse.h,v 1.1
  2009.10.29.17.17.33 christos src/external/bsd/flex/bin/Makefile,v 1.5
  2009.10.29.18.20.11 eeh src/sys/ufs/lfs/lfs_vfsops.c,v 1.279
  2009.10.29.21.03.59 christos src/external/bsd/byacc/dist/defs.h,v 1.3
  2009.10.29.21.03.59 christos src/external/bsd/byacc/dist/output.c,v 1.3
  2009.10.29.21.03.59 christos src/external/bsd/byacc/dist/reader.c,v 1.3
  2009.10.29.21.03.59 christos src/external/bsd/byacc/dist/skeleton.c,v 1.5
  2009.10.29.21.11.57 christos src/external/bsd/byacc/dist/output.c,v 1.4
  2009.10.30.00.30.20 christos src/tools/lex/Makefile,v 1.9
  2009.10.30.00.53.29 christos src/sys/ufs/lfs/lfs_vnops.c,v 1.223
  2009.10.30.01.40.45 joerg src/usr.bin/m4/m4.1,v 1.21
  2009.10.30.01.53.02 christos src/external/bsd/byacc/dist/config_h.in,v 1.3
  2009.10.30.01.57.48 christos src/sys/dev/pci/pm2reg.h,v 1.3
  2009.10.30.10.57.40 njoly src/sys/compat/linux/arch/amd64/syscalls.master,v 1.33
  2009.10.30.10.58.15 njoly src/sys/compat/linux/arch/amd64/linux_syscallargs.h,v 1.35
  2009.10.30.10.58.15 njoly src/sys/compat/linux/arch/amd64/linux_syscall.h,v 1.35
  2009.10.30.10.58.15 njoly src/sys/compat/linux/arch/amd64/linux_syscalls.c,v 1.35
  2009.10.30.10.58.15 njoly src/sys/compat/linux/arch/amd64/linux_sysent.c,v 1.35
  2009.10.30.15.05.54 he src/sys/arch/sparc/sparc/pmap.c,v 1.336
  2009.10.30.15.09.24 uebayasi src/usr.bin/tn3270/tools/mkmake/mkmake.y,v 1.14

It appears that the sh process running out of swap is the one forked
by init to create the tmpfs /dev.

I tried the experiment of installing the 2009.10.30.15.09.24
version (with 64 MB of memory to make the installation succeed),
booting the installed system, adding the line "ps -glaxw" at the end
of /dev/MAKEDEV, and then running "sh MAKEDEV -MM init".  Sure enough,
the VSZ of the sh process is huge:

  # sh MAKEDEV -MM init
  Created tmpfs /dev (1490944 byte, 2880 inodes)
  UID PID PPID   CPU PRI NI   VSZ   RSS WCHAN   STAT TTY      TIME COMMAND
    0   0    0     0 125  0     0 16504 uvm     DKl  ?     0:00.68 [system]
    0   1    0  9105  83  0  3044   756 wait    Is   ?     0:00.12 init
    0 146    1     0  85  0  4936  1152 kqueue  Ss   ?     0:00.08 /usr/sbin/syslogd -s
    0 324    1     0  85  0  3012   788 nanoslp Ss   ?     0:00.03 /usr/sbin/cron
    0 330    1  7152  84  0  3092   788 kqueue  Is   ?     0:00.02 /usr/sbin/inetd -l
    0 337    1  1117  85  0  5828  1680 wait    Is   tty00 0:00.20 login
    0 344  337     0  85  0  3068   952 wait    S    tty00 0:00.15 -sh
    0 471  344 13031  85  0 27644 23500 wait    S+   tty00 0:01.15 sh MAKEDEV -MM init
    0 541  471 13031  42  0  3064   852 -       O+   tty00 0:00.01 ps -glaxw
    0 348    1  1218  85  0  3024   820 ttyraw  Is+  ttyE1 0:00.01 /usr/libexec/getty Pc ttyE1
    0 339    1  1218  85  0  3024   828 ttyraw  Is+  ttyE2 0:00.02 /usr/libexec/getty Pc ttyE2
    0 346    1  1356  85  0  3024   828 ttyraw  Is+  ttyE3 0:00.01 /usr/libexec/getty Pc ttyE3

I suspect the lex/yacc changes...
--
Andreas Gustafsson, gson@...

Re: Recent /bin/sh breakage

by David Laight

On Wed, Nov 11, 2009 at 11:41:20PM +0200, Andreas Gustafsson wrote:
>
>   # sh MAKEDEV -MM init
>   Created tmpfs /dev (1490944 byte, 2880 inodes)
>   UID PID PPID   CPU PRI NI   VSZ   RSS WCHAN   STAT TTY      TIME COMMAND
>     0 471  344 13031  85  0 27644 23500 wait    S+   tty00 0:01.15 sh MAKEDEV -MM init

Something must be leaking badly :-)

        David

--
David Laight: david@...

Re: Recent /bin/sh breakage

by Andreas Gustafsson

David Laight wrote:
> >   UID PID PPID   CPU PRI NI   VSZ   RSS WCHAN   STAT TTY      TIME COMMAND
> >     0 471  344 13031  85  0 27644 23500 wait    S+   tty00 0:01.15 sh MAKEDEV -MM init
>
> Something must be leaking badly :-)

Indeed, and I think I have found the reason for the leak.  The sh
arithmetic expression YACC parser contains a return statement in a
parser action; this is a bad idea because it causes yyparse() to
return without freeing the malloc'ed parser state.  I will commit the
following fix shortly.

Index: src/bin/sh/arith.y
===================================================================
RCS file: /bracket/repo/src/bin/sh/arith.y,v
retrieving revision 1.18
diff -u -r1.18 arith.y
--- src/bin/sh/arith.y 25 Mar 2007 06:29:26 -0000 1.18
+++ src/bin/sh/arith.y 12 Nov 2009 20:00:34 -0000
@@ -83,7 +83,6 @@
  * the desired result elsewhere.
  */
  arith_result = $1;
- return 0;
  }
  ;
 
--
Andreas Gustafsson, gson@...
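
[Editor's note: the leak described in this message can be illustrated
with a toy model. This is only a sketch, not the code generated by
yacc: toy_parse(), stack_alloc(), stack_free() and stacks_in_use() are
hypothetical names, and a small fixed pool stands in for malloc() so
that exhaustion is easy to observe. The assumption being illustrated is
the one stated above: the parser allocates its state on entry and frees
it on its normal exit path, so a "return" inside an action skips the
free.]

```c
#include <stddef.h>

#define POOL_SLOTS  4    /* tiny on purpose, so exhaustion shows quickly */
#define STACK_DEPTH 16

/* Hypothetical fixed pool standing in for malloc()/free(). */
static int pool[POOL_SLOTS][STACK_DEPTH];
static int in_use[POOL_SLOTS];

static int *stack_alloc(void)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL;                      /* pool exhausted */
}

static void stack_free(int *stack)
{
    for (int i = 0; i < POOL_SLOTS; i++)
        if (pool[i] == stack)
            in_use[i] = 0;
}

/* Number of parser stacks currently allocated. */
int stacks_in_use(void)
{
    int n = 0;
    for (int i = 0; i < POOL_SLOTS; i++)
        n += in_use[i];
    return n;
}

/*
 * Toy stand-in for yyparse().  Returns 0 on success and 1 when no
 * stack can be allocated.  If return_in_action is nonzero, the
 * "action" returns directly, as the removed line in arith.y did,
 * and the cleanup at the end of the function is skipped.
 */
int toy_parse(int value, int return_in_action, int *result)
{
    int *stack = stack_alloc();
    if (stack == NULL)
        return 1;

    stack[0] = value;                 /* "shift" the operand */

    /* --- begin grammar action --- */
    *result = stack[0];
    if (return_in_action)
        return 0;                     /* leaves without freeing the stack */
    /* --- end grammar action --- */

    stack_free(stack);                /* normal exit path */
    return 0;
}
```

[With return_in_action set, each call leaves one stack allocated, and
once the pool is used up every later call fails; with it clear, the
count of stacks in use returns to zero after every call. POSIX yacc
also defines a YYACCEPT macro for leaving the parser from inside an
action; whether it would have released the parser state in the yacc
version in use here is not established in this thread.]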

Re: Recent /bin/sh breakage

by Christos Zoulas

On Nov 13,  3:39pm, gson@... (Andreas Gustafsson) wrote:
-- Subject: Re: Recent /bin/sh breakage

| David Laight wrote:
| > >   UID PID PPID   CPU PRI NI   VSZ   RSS WCHAN   STAT TTY      TIME COMMAND
| > >     0 471  344 13031  85  0 27644 23500 wait    S+   tty00 0:01.15 sh MAKEDEV -MM init
| >
| > Something must be leaking badly :-)
|
| Indeed, and I think I have found the reason for the leak.  The sh
| arithmetic expression YACC parser contains a return statement in a
| parser action; this is a bad idea because it causes yyparse() to
| return without freeing the malloc'ed parser state.  I will commit the
| following fix shortly.
|
| Index: src/bin/sh/arith.y
| ===================================================================
| RCS file: /bracket/repo/src/bin/sh/arith.y,v
| retrieving revision 1.18
| diff -u -r1.18 arith.y
| --- src/bin/sh/arith.y 25 Mar 2007 06:29:26 -0000 1.18
| +++ src/bin/sh/arith.y 12 Nov 2009 20:00:34 -0000
| @@ -83,7 +83,6 @@
|   * the desired result elsewhere.
|   */
|   arith_result = $1;
| - return 0;
|   }
|   ;
|  

Good catch! That must be a result of upgrading yacc: the new version
uses malloc on each yyparse call, instead of re-using memory, so that
the parser can be re-entrant!

Very nice!

christos
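
[Editor's note: the trade-off mentioned in this message can be sketched
in a few lines of C. This is a toy contrast under stated assumptions,
not byacc's actual code; old_parse(), new_parse() and shared_stack are
hypothetical names. The "old" style keeps one static stack, so nothing
is ever leaked but a nested call overwrites the caller's state; the
"new" style allocates per call, which makes nesting safe but obliges
every exit path to free the allocation.]

```c
#include <stdlib.h>

/* Hypothetical stand-in for a parser's value stack. */
static int shared_stack[1];

/* Old style: every call uses the same static storage. */
int old_parse(int depth)
{
    shared_stack[0] = depth;
    if (depth > 0)
        old_parse(depth - 1);   /* nested call overwrites shared_stack[0] */
    return shared_stack[0];     /* no longer the value this call stored */
}

/* New style: each call owns its stack, so it must also free it. */
int new_parse(int depth)
{
    int *stack = malloc(sizeof *stack);
    int result;

    if (stack == NULL)
        return -1;
    stack[0] = depth;
    if (depth > 0)
        new_parse(depth - 1);   /* nested call has its own stack */
    result = stack[0];
    free(stack);                /* skipping this is the leak fixed above */
    return result;
}
```

[For any depth greater than zero, old_parse() returns 0 because the
innermost call wrote last, while new_parse() returns the depth it was
given.]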