crash in rn_walktree on reboot/shutdown

by Christoph Egger

Hi!

On NetBSD/i386 -current, I get a crash on every reboot/shutdown.

uvm_fault(0xcb3815cc, 0, 1) -> 0xe
fatal page fault in supervisor mode
trap type 6 code 0 eip c0501ff5 cs 8 eflags 286 cr2 8 ilevel 6
kernel: supervisor trap page fault, code=0
Stopped in pid 477.1 (reboot) at        netbsd:rn_walktree+0x65:        cmpw    $0,0x8(%esi)
db{0}> bt
rn_walktree(c11b9980,c05323f0,cb52d9e4,6,0,c11c8800,cb52db6c,0,c0300f20,caac8010) at netbsd:rn_walktree+0x65
rt_walktree(2,c0300f20,caac8010,0,0,1,cb52da4c,c068be1b,0,0) at netbsd:rt_walktree+0x32
if_detach(caac8010,ffffffff,10,ca876720,ca876732,cb366000,caac8d00,caac8000,caac8d00,c097bae0) at netbsd:if_detach+0xaf
tlp_detach(caac8000,0,10,c098e500,c09cca60,caac8d00,cb52dbec,c058fa57,caac8d00,4) at netbsd:tlp_detach+0x79
tlp_pci_detach(caac8d00,4,cb52dbec,c059d8b6,c058e2e6,ca877200,2d,caac8d00,1,cb50f7e0) at netbsd:tlp_pci_detach+0x1b
config_detach(caac8d00,4,c058fcff,c0a1b160,0,0,cb52dc3c,c0482e89,0,0) at netbsd:config_detach+0x157
config_detach_all(0,0,0,1,23,cb50f7e0,cb52dcdc,0,0,0) at netbsd:config_detach_all+0x86
cpu_reboot(0,0,cb52dc7c,c046299c,ca426f00,cb382b40,cb50f7e0,cb52dda0,2a,cb92d780) at netbsd:cpu_reboot+0x1b9
sys_reboot(cb50f7e0,cb52dd00,cb52dd28,cb52dd00,cb3815cc,cb4b1774,d0,0,0,bfbfeb28) at netbsd:sys_reboot+0x52
syscall(cb52dd48,b3,ab,1f,1f,0,0,bfbfeb28,2,256) at netbsd:syscall+0xb9
db{0}> show reg
ds          0x10
es          0x10
fs          0x30
gs          0x10
edi         0xcb52d9e4
esi         0
ebp         0xcb52d9bc
ebx         0
edx         0xc117e090
ecx         0xc117e090
eax         0xc117e120
eip         0xc0501ff5  rn_walktree+0x65
cs          0x8
eflags      0x286
esp         0xcb52d9a4
ss          0x10
netbsd:rn_walktree+0x65:        cmpw    $0,0x8(%esi)

Re: crash in rn_walktree on reboot/shutdown

by David Young

On Wed, Oct 28, 2009 at 08:07:08AM +0100, Christoph Egger wrote:
>
>
> Hi!
>
> On NetBSD/i386 -current, I get a crash on every reboot/shutdown.

I occasionally see the same crash as you report, but I have not been
able to replicate it reliably.

IIRC, rn_walktree() walks the binary trie in depth-first order, 0, 1, 2,
...:

       6
       /\
      /  \
     /    \
    /      \
   2        5
  /\        /\
 /  \      /  \
0    1    3    4

and invokes the callback on each *leaf* node.

rn_walktree() precomputes the next leaf before invoking the callback on
the current leaf, because the callback may delete the leaf.
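
The precompute-the-successor pattern described above can be sketched in user space. This is only an illustration, not the kernel code: it assumes a doubly linked list in place of the radix trie, and struct leaf, leaf_push(), leaf_remove(), walk(), drop_even() and demo_walk() are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a routing-table leaf; not struct radix_node. */
struct leaf {
	int key;
	struct leaf *prev, *next;
};

/* Insert a leaf at the front of the list; returns 0 on success. */
static int leaf_push(struct leaf **head, int key)
{
	struct leaf *lf = malloc(sizeof(*lf));

	if (lf == NULL)
		return -1;
	lf->key = key;
	lf->prev = NULL;
	lf->next = *head;
	if (*head != NULL)
		(*head)->prev = lf;
	*head = lf;
	return 0;
}

/* Unlink and free one leaf. */
static void leaf_remove(struct leaf **head, struct leaf *lf)
{
	if (lf->prev != NULL)
		lf->prev->next = lf->next;
	else
		*head = lf->next;
	if (lf->next != NULL)
		lf->next->prev = lf->prev;
	free(lf);
}

typedef int (*leaf_fn)(struct leaf **, struct leaf *, void *);

/* Visit every leaf; the successor is saved before f() runs. */
static int walk(struct leaf **head, leaf_fn f, void *arg)
{
	struct leaf *lf, *next;
	int error;

	for (lf = *head; lf != NULL; lf = next) {
		next = lf->next;	/* f() may free lf */
		if ((error = f(head, lf, arg)) != 0)
			return error;
	}
	return 0;
}

/* Callback: delete leaves with even keys, counting them in *arg. */
static int drop_even(struct leaf **head, struct leaf *lf, void *arg)
{
	if (lf->key % 2 == 0) {
		leaf_remove(head, lf);
		++*(int *)arg;
	}
	return 0;
}

/* Build keys 0..5, delete the even ones, return how many leaves survive. */
int demo_walk(void)
{
	struct leaf *head = NULL;
	int dropped = 0, left = 0;

	for (int k = 0; k < 6; k++)
		if (leaf_push(&head, k) != 0)
			abort();
	(void)walk(&head, drop_even, &dropped);
	assert(dropped == 3);		/* keys 0, 2 and 4 */
	while (head != NULL) {		/* count and free the survivors */
		leaf_remove(&head, head);
		left++;
	}
	return left;
}
```

The walk is safe here only because the callback never touches any leaf other than the one it was handed.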

In the if_detach() case, the callback is if_rt_walktree(), which will
delete a leaf/rtentry if it references the ifnet that if_detach() is
destroying.

I suspect that an occasional side-effect of if_rt_walktree() deleting
a leaf is to delete the next leaf that rn_walktree() precomputed.  For
example, rtflushclone() may delete one or more leaves when it calls
rt_walktree() to delete cloned routes.  If the next leaf was deleted,
its pointers rn_l/rn_r/rn_p may not respect the radix_node contract when
rn_walktree() visits it, so a crash is possible.
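
To make the suspected failure mode concrete, here is a small user-space sketch under loud assumptions: a four-element linked list stands in for the trie, and a removed node is flagged as dead with its pointers cleared rather than freed, so that the demonstration stays well defined. All names (struct leaf, leaf_unlink(), remove_with_neighbour(), count_stale_successors()) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a leaf; "dead" replaces an actual free(). */
struct leaf {
	int key;
	int dead;
	struct leaf *prev, *next;
};

/* Unlink a leaf and mark it dead, clearing its now meaningless pointers. */
static void leaf_unlink(struct leaf **head, struct leaf *lf)
{
	if (lf->prev != NULL)
		lf->prev->next = lf->next;
	else
		*head = lf->next;
	if (lf->next != NULL)
		lf->next->prev = lf->prev;
	lf->prev = lf->next = NULL;
	lf->dead = 1;
}

/* Callback with a side effect: removes the leaf and its successor. */
static void remove_with_neighbour(struct leaf **head, struct leaf *lf)
{
	struct leaf *victim = lf->next;

	leaf_unlink(head, lf);
	if (victim != NULL)
		leaf_unlink(head, victim);
}

/*
 * Walk keys 0..3 with a precomputed successor.  The callback fires on
 * key 1 and also removes key 2, which the walker has already saved as
 * "next".  Returns how many times the saved successor was found dead.
 */
int count_stale_successors(void)
{
	struct leaf n[4];
	struct leaf *head = &n[0], *lf, *next;
	int stale = 0;

	for (int i = 0; i < 4; i++) {
		n[i].key = i;
		n[i].dead = 0;
		n[i].prev = i > 0 ? &n[i - 1] : NULL;
		n[i].next = i < 3 ? &n[i + 1] : NULL;
	}
	for (lf = head; lf != NULL; lf = next) {
		next = lf->next;		/* precomputed successor */
		if (lf->key == 1)
			remove_with_neighbour(&head, lf);
		if (next != NULL && next->dead) {
			stale++;		/* a real walker would now */
			break;			/* follow invalid pointers */
		}
	}
	return stale;
}
```

In this sketch the walker notices the problem only because of the explicit dead flag; a walker without such a check would simply dereference whatever the removed node's pointers contain.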

What can you do to fix this?  I suggest rewriting if_rt_walktree(), and
the following code in if_detach(), to record as many rtentries to delete
as will fit in some array, delete them, and repeat until there are no
more rtentries to delete.

        /* Walk the routing table looking for stragglers. */
        for (i = 0; i <= AF_MAX; i++)
                (void)rt_walktree(i, if_rt_walktree, ifp);

Perhaps rtflushclone() can/should be rewritten in similar fashion?
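
The record-then-delete loop suggested above might look roughly like this in user space. It is a sketch under assumptions, not a proposed kernel change: a linked list replaces the routing table, an integer owner replaces the ifnet pointer, and BATCH, struct batch, collect() and purge_owner() are hypothetical names. It also assumes that removing one recorded entry never frees another recorded entry; code where deletions can cascade would need to hold references (or similar) on everything in the array.

```c
#include <assert.h>
#include <stdlib.h>

#define BATCH 4			/* hypothetical array size */

struct leaf {
	int owner;		/* stands in for the rtentry's ifnet */
	struct leaf *prev, *next;
};

struct batch {
	int owner;
	size_t n;
	struct leaf *victim[BATCH];
};

static int leaf_push(struct leaf **head, int owner)
{
	struct leaf *lf = malloc(sizeof(*lf));

	if (lf == NULL)
		return -1;
	lf->owner = owner;
	lf->prev = NULL;
	lf->next = *head;
	if (*head != NULL)
		(*head)->prev = lf;
	*head = lf;
	return 0;
}

static void leaf_remove(struct leaf **head, struct leaf *lf)
{
	if (lf->prev != NULL)
		lf->prev->next = lf->next;
	else
		*head = lf->next;
	if (lf->next != NULL)
		lf->next->prev = lf->prev;
	free(lf);
}

/* Read-only walk: f() must not modify the list; nonzero stops the walk. */
static int walk(struct leaf *head, int (*f)(struct leaf *, void *), void *arg)
{
	for (struct leaf *lf = head; lf != NULL; lf = lf->next) {
		int error = f(lf, arg);

		if (error != 0)
			return error;
	}
	return 0;
}

/* Callback: record matching leaves; stop the walk once the array is full. */
static int collect(struct leaf *lf, void *arg)
{
	struct batch *b = arg;

	if (lf->owner != b->owner)
		return 0;
	b->victim[b->n++] = lf;
	return b->n == (size_t)BATCH;
}

/* Delete every leaf belonging to owner, one batch at a time. */
int purge_owner(struct leaf **head, int owner)
{
	struct batch b = { .owner = owner, .n = 0 };
	int removed = 0;

	do {
		b.n = 0;
		(void)walk(*head, collect, &b);
		for (size_t i = 0; i < b.n; i++) {
			leaf_remove(head, b.victim[i]);
			removed++;
		}
	} while (b.n > 0);
	return removed;
}

/* Ten leaves, six of them owned by 1; returns the number purged. */
int demo_purge(void)
{
	struct leaf *head = NULL;
	int removed, left = 0;

	for (int k = 0; k < 10; k++)
		if (leaf_push(&head, k % 3 == 0 ? 2 : 1) != 0)
			abort();
	removed = purge_owner(&head, 1);
	while (head != NULL) {
		leaf_remove(&head, head);
		left++;
	}
	assert(left == 4);
	return removed;
}
```

Because nothing is deleted while the walk is in progress, the walker never holds a pointer to a node that the callback has removed.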

Dave

--
David Young             OJC Technologies
dyoung@...      Urbana, IL * (217) 278-3933

Re: crash in rn_walktree on reboot/shutdown

by Matthias Drochner

dyoung@... said:
> I suspect that an occasional side-effect of if_rt_walktree() deleting
> a leaf is to delete the next leaf that rn_walktree() precomputed.

Here is what I have had in my tree for many months. I originally did
this to compensate for some bug introduced elsewhere. While it
certainly has the potential to hide bugs, it makes the code more
robust.

best regards
Matthias



------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe
Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
Prof. Dr. Sebastian M. Schmidt
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------

#
# old_revision [e7e99d7bd1a7ec16f632f9d489a3bfda6ac02d4d]
#
# patch "sys/net/if.c"
#  from [2f764544bedf58fa053320c517941791147f12c3]
#    to [f1d6674f5866d890c5126dcc93b75eb0116590bf]
#
# patch "sys/net/radix.c"
#  from [f18791be726cd29df4eadf23233c63ab529b8f29]
#    to [da46876ad1655520145e39dc80adf1bb7e32d7c0]
#
============================================================
--- sys/net/if.c 2f764544bedf58fa053320c517941791147f12c3
+++ sys/net/if.c f1d6674f5866d890c5126dcc93b75eb0116590bf
@@ -904,7 +904,7 @@ if_rt_walktree(struct rtentry *rt, void
 	if (error != 0)
 		printf("%s: warning: unable to delete rtentry @ %p, "
 		    "error = %d\n", ifp->if_xname, rt, error);
-	return 0;
+	return (-1);
 }
 
 /*
============================================================
--- sys/net/radix.c f18791be726cd29df4eadf23233c63ab529b8f29
+++ sys/net/radix.c da46876ad1655520145e39dc80adf1bb7e32d7c0
@@ -987,6 +987,7 @@ rn_walktree(
 	 * while applying the function f to it, so we need to calculate
 	 * the successor node in advance.
 	 */
+again:
 	rn = rn_walkfirst(h->rnh_treetop, NULL, NULL);
 	for (;;) {
 		base = rn;
@@ -994,8 +995,13 @@ rn_walktree(
 		/* Process leaves */
 		while ((rn = base) != NULL) {
 			base = rn->rn_dupedkey;
-			if (!(rn->rn_flags & RNF_ROOT) && (error = (*f)(rn, w)))
-				return error;
+			if (!(rn->rn_flags & RNF_ROOT)
+			    && (error = (*f)(rn, w))) {
+				if (error == -1)
+					goto again;
+				else
+					return error;
+			}
 		}
 		rn = next;
 		if (rn->rn_flags & RNF_ROOT)
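
The restart-after-delete idea in the patch above can be tried out in user space with a sketch like the following. It is an illustration under assumptions, not the patched kernel code: a linked list replaces the trie, an integer owner replaces the ifnet, and WALK_RESTART (a made-up name for the patch's literal -1), struct purge, drop_owner() and demo_restart() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define WALK_RESTART (-1)	/* hypothetical name for the patch's -1 */

struct leaf {
	int owner;
	struct leaf *prev, *next;
};

struct purge {
	int owner;
	int removed;
};

static int leaf_push(struct leaf **head, int owner)
{
	struct leaf *lf = malloc(sizeof(*lf));

	if (lf == NULL)
		return -1;
	lf->owner = owner;
	lf->prev = NULL;
	lf->next = *head;
	if (*head != NULL)
		(*head)->prev = lf;
	*head = lf;
	return 0;
}

static void leaf_remove(struct leaf **head, struct leaf *lf)
{
	if (lf->prev != NULL)
		lf->prev->next = lf->next;
	else
		*head = lf->next;
	if (lf->next != NULL)
		lf->next->prev = lf->prev;
	free(lf);
}

typedef int (*leaf_fn)(struct leaf **, struct leaf *, void *);

/* Walk the list; a WALK_RESTART result discards the saved successor. */
static int walk(struct leaf **head, leaf_fn f, void *arg)
{
	struct leaf *lf, *next;
	int error;

again:
	for (lf = *head; lf != NULL; lf = next) {
		next = lf->next;
		error = f(head, lf, arg);
		if (error == WALK_RESTART)
			goto again;	/* "next" may be stale: start over */
		if (error != 0)
			return error;
	}
	return 0;
}

/*
 * Callback: remove a matching leaf and, to mimic a cascading delete,
 * its successor too when that matches.  Always asks for a restart
 * after removing something.
 */
static int drop_owner(struct leaf **head, struct leaf *lf, void *arg)
{
	struct purge *p = arg;
	struct leaf *nb = lf->next;

	if (lf->owner != p->owner)
		return 0;
	if (nb != NULL && nb->owner == p->owner) {
		leaf_remove(head, nb);
		p->removed++;
	}
	leaf_remove(head, lf);
	p->removed++;
	return WALK_RESTART;
}

/* Six leaves, four owned by 1; returns the number removed. */
int demo_restart(void)
{
	static const int owners[] = { 1, 1, 2, 1, 2, 1 };
	struct leaf *head = NULL;
	struct purge p = { .owner = 1, .removed = 0 };
	int left = 0;

	for (size_t i = 0; i < sizeof(owners) / sizeof(owners[0]); i++)
		if (leaf_push(&head, owners[i]) != 0)
			abort();
	(void)walk(&head, drop_owner, &p);
	while (head != NULL) {
		leaf_remove(&head, head);
		left++;
	}
	assert(left == 2);
	return p.removed;
}
```

Two properties of this sketch are worth noting. Each restart rescans from the head, so the cost grows with the number of deletions times the list length. And the loop terminates only because every restart follows an actual removal; a callback that returned WALK_RESTART without removing anything would spin forever.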

Re: crash in rn_walktree on reboot/shutdown

by Christoph Egger

Matthias Drochner wrote:

> dyoung@... said:
>> I suspect that an occasional side-effect of if_rt_walktree() deleting
>> a leaf is to delete the next leaf that rn_walktree() precomputed.
>
> Here is what I have had in my tree for many months. I originally did
> this to compensate for some bug introduced elsewhere. While it
> certainly has the potential to hide bugs, it makes the code more
> robust.
>
> best regards
> Matthias

I tested your patch. I can't reproduce the crash with it.

Christoph