Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Thomas Gleixner

On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> On Thu, Jul 09, 2009 at 12:23:17AM +0200, Andres Freund wrote:
> ...
> > Unfortunately this just yields the same backtraces during softlockup and not
> > earlier.
> > I did not test without lockdep yet, but that should not have stopped the BUG
> > from appearing, right?
>
> Since it looks like hrtimers now, these changes in timers shouldn't
> matter. Let's wait for new ideas.

Some background:

Up to 2.6.30, hrtimer_start() and add_timer() enqueue (hr)timers on the
CPU on which the functions are called. There is one exception: when the
timer callback is currently running on another CPU, the timer is
enqueued on that other CPU.

The migration patches change that behaviour and enqueue the timer on
the nohz.idle_balancer CPU when parts of the system are idle.

With the migration code disabled (via sysctl or the #if 0 patch) the
timer is always enqueued on the same CPU, i.e. you get the 2.6.30
behaviour back.
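
The enqueue rules described above can be summarised in a small userspace model. This is only a sketch: struct enqueue_ctx, pick_target_cpu() and all field names are invented for illustration, and the assumption that migration is considered only when the calling CPU itself is idle is mine, not something stated in this mail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: every name here is invented. */
struct enqueue_ctx {
    int  this_cpu;           /* CPU calling hrtimer_start()/add_timer() */
    int  callback_cpu;       /* CPU currently running the callback, or -1 */
    int  idle_balancer_cpu;  /* stand-in for nohz.idle_balancer, or -1 */
    bool this_cpu_idle;      /* assumption: only an idle caller migrates */
    bool migration_enabled;  /* the sysctl / "#if 0" switch */
    bool pinned;             /* caller asked for a *_PINNED mode */
};

/* Returns the CPU on which the timer would be enqueued in this model. */
static int pick_target_cpu(const struct enqueue_ctx *c)
{
    assert(c->this_cpu >= 0);

    /* Exception that already existed in 2.6.30: follow the running callback. */
    if (c->callback_cpu >= 0)
        return c->callback_cpu;

    /* New behaviour: hand the timer to the idle balancer CPU. */
    if (c->migration_enabled && !c->pinned &&
        c->this_cpu_idle && c->idle_balancer_cpu >= 0)
        return c->idle_balancer_cpu;

    /* 2.6.30 behaviour, also what you get with migration disabled. */
    return c->this_cpu;
}
```

With migration_enabled set to false the last branch is always taken, which corresponds to getting the 2.6.30 behaviour back.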

As you found out, it is probably related to hrtimers. Checking the
network code, the only hrtimer users are in net/sched/sch_api.c and
net/sched/sch_cbq.c. There are some in net/can as well, but that's
probably irrelevant for the problem at hand.

I'm not familiar with that code, so I have no clue which problems
might pop up due to enqueueing the timer on another CPU, but there is
one pretty suspicious code sequence in cbq_ovl_delay():

        expires = ktime_set(0, 0);
        expires = ktime_add_ns(expires, PSCHED_US2NS(sched));
        if (hrtimer_try_to_cancel(&q->delay_timer) &&
            ktime_to_ns(ktime_sub(
                        hrtimer_get_expires(&q->delay_timer),
                        expires)) > 0)
                hrtimer_set_expires(&q->delay_timer, expires);
        hrtimer_restart(&q->delay_timer);

So we set the expiry value of the timer only when the timer was active
(hrtimer_try_to_cancel() returned != 0) and the new expiry time is
before the expiry time which was in the active timer. If the timer was
inactive we start the timer with the last expiry time which is
probably already in the past.

I'm quite sure that this is not causing the migration problem, because
we do not enqueue it on a different CPU when the timer is already
expired.
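
To make the inactive-timer case concrete, here is a minimal userspace model of the quoted sequence. It is a sketch under stated assumptions: struct model_timer, model_try_to_cancel() and rearm_as_quoted() are hypothetical stand-ins, and the callback-running case is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an hrtimer; only the fields the argument needs. */
struct model_timer {
    bool    active;      /* currently enqueued? */
    int64_t expires_ns;  /* absolute expiry time */
};

/* Models hrtimer_try_to_cancel() for a timer whose callback is not
 * running: returns 1 if the timer was active, 0 otherwise. */
static int model_try_to_cancel(struct model_timer *t)
{
    int was_active = t->active ? 1 : 0;

    t->active = false;
    return was_active;
}

/* Mirrors the quoted logic: the expiry is only updated when the timer
 * was active AND the new expiry is earlier than the old one. */
static void rearm_as_quoted(struct model_timer *t, int64_t new_expires_ns)
{
    assert(new_expires_ns >= 0);

    if (model_try_to_cancel(t) && t->expires_ns - new_expires_ns > 0)
        t->expires_ns = new_expires_ns;

    t->active = true;    /* stands for hrtimer_restart() */
}
```

In this model an inactive timer that last expired at 100 and is rearmed for 1500 ends up active with an expiry of 100, i.e. the stale value.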

For completeness: hrtimer_try_to_cancel() can return -1 when the timer
callback is running. So in that case we also fiddle with the expiry
value and restart the timer while the callback code itself might do
the same. There seems to be no serialization between that code and the
callback. The watchdog timer callback in sch_api.c is not serialized
either.
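
The reason -1 slips through is that a plain truth test in C treats any non-zero value as true. A minimal sketch, with hypothetical names (enum model_cancel_result, truthy_check(), was_dequeued()) that only mirror the three documented return values of hrtimer_try_to_cancel():

```c
#include <assert.h>
#include <stdbool.h>

/* The three results described above, as a hypothetical enum. */
enum model_cancel_result {
    MODEL_CALLBACK_RUNNING = -1,  /* callback is executing right now */
    MODEL_WAS_INACTIVE     = 0,   /* timer was not enqueued */
    MODEL_WAS_ACTIVE       = 1,   /* timer was enqueued and got removed */
};

/* What "if (hrtimer_try_to_cancel(...) && ...)" effectively tests. */
static bool truthy_check(int ret)
{
    return ret != 0;
}

/* An explicit test that does not mistake "callback running" for
 * "timer was dequeued". */
static bool was_dequeued(int ret)
{
    assert(ret >= -1 && ret <= 1);
    return ret == MODEL_WAS_ACTIVE;
}
```

truthy_check() accepts both MODEL_WAS_ACTIVE and MODEL_CALLBACK_RUNNING, while was_dequeued() accepts only the former. Whether the callback-running case then needs a retry or some other handling is a design decision this sketch does not answer.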

There is another oddity in cbq_undelay() which is the hrtimer callback
function:

        if (delay) {
                ktime_t time;

                time = ktime_set(0, 0);
                time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
                hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);

The canonical way to restart a hrtimer from the callback function is to
set the expiry value and return HRTIMER_RESTART.

        }

        sch->flags &= ~TCQ_F_THROTTLED;
        __netif_schedule(qdisc_root(sch));
        return HRTIMER_NORESTART;

Again, this should not cause the timer to be enqueued on another CPU
as we do not enqueue on a different CPU when the callback is running,
but see above ...
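
The restart-from-callback pattern mentioned above can be illustrated with a toy userspace model. Everything here (struct toy_timer, periodic_cb(), model_run_expired(), the 1000 ns period) is hypothetical; it only shows the division of labour in which the callback updates the expiry and the core does the re-enqueueing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical equivalents of HRTIMER_NORESTART / HRTIMER_RESTART. */
enum toy_restart { TOY_NORESTART, TOY_RESTART };

struct toy_timer {
    int64_t expires_ns;      /* absolute expiry time */
    int     enqueue_count;   /* how often the core re-queued the timer */
    enum toy_restart (*fn)(struct toy_timer *);
};

/* Callback in the canonical shape: move the expiry forward and let
 * the core re-enqueue, instead of starting the timer from inside. */
static enum toy_restart periodic_cb(struct toy_timer *t)
{
    t->expires_ns += 1000;   /* made-up period for the example */
    return TOY_RESTART;
}

/* Toy core: runs the callback once, re-enqueues only on TOY_RESTART. */
static void model_run_expired(struct toy_timer *t)
{
    assert(t->fn != NULL);

    if (t->fn(t) == TOY_RESTART)
        t->enqueue_count++;
}
```

Because only the core touches the queue in this shape, the callback cannot race with its own re-enqueueing.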

I have the feeling that the code relies on some implicit cpu
boundness, which is no longer guaranteed with the timer migration
changes, but that's a question for the network experts.

Thanks,

        tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Jarek Poplawski

On Thu, Jul 09, 2009 at 12:31:53PM +0200, Thomas Gleixner wrote:

> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > On Thu, Jul 09, 2009 at 12:23:17AM +0200, Andres Freund wrote:
> > ...
> > > Unfortunately this just yields the same backtraces during softlockup and not
> > > earlier.
> > > I did not test without lockdep yet, but that should not have stopped the BUG
> > > from appearing, right?
> >
> > Since it looks like hrtimers now, these changes in timers shouldn't
> > matter. Let's wait for new ideas.
>
> Some background:
...

> There is another oddity in cbq_undelay() which is the hrtimer callback
> function:
>
>         if (delay) {
>                 ktime_t time;
>
>                 time = ktime_set(0, 0);
>                 time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
>                 hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);
>
> The canonical way to restart a hrtimer from the callback function is to
> set the expiry value and return HRTIMER_RESTART.

OK, that's for later because we didn't use cbq here.

>
>         }
>
>         sch->flags &= ~TCQ_F_THROTTLED;
>         __netif_schedule(qdisc_root(sch));
>         return HRTIMER_NORESTART;
>
> Again, this should not cause the timer to be enqueued on another CPU
> as we do not enqueue on a different CPU when the callback is running,
> but see above ...
>
> I have the feeling that the code relies on some implicit cpu
> boundness, which is no longer guaranteed with the timer migration
> changes, but that's a question for the network experts.

As a matter of fact, I've just looked at this __netif_schedule(),
which really is cpu bound, so you might be 100% right.

Thanks for your help,
Jarek P.

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Thomas Gleixner

On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> >
> > I have the feeling that the code relies on some implicit cpu
> > boundness, which is no longer guaranteed with the timer migration
> > changes, but that's a question for the network experts.
>
> As a matter of fact, I've just looked at this __netif_schedule(),
> which really is cpu bound, so you might be 100% right.

So the watchdog is the one which causes the trouble. The patch below
should fix this.
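
For readers unfamiliar with the pinned modes: the patch in this message only changes the mode argument. The sketch below models how such mode values can be decoded. The numeric values are given as I recall them from include/linux/hrtimer.h of that era and should be treated as an assumption; the MODEL_ names and the two helper functions are invented.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values; check the kernel header before relying on them. */
enum model_hrtimer_mode {
    MODEL_MODE_ABS        = 0x0,  /* expiry is an absolute time */
    MODEL_MODE_REL        = 0x1,  /* expiry is relative to now */
    MODEL_MODE_PINNED     = 0x2,  /* keep the timer on the calling CPU */
    MODEL_MODE_ABS_PINNED = MODEL_MODE_ABS | MODEL_MODE_PINNED,
    MODEL_MODE_REL_PINNED = MODEL_MODE_REL | MODEL_MODE_PINNED,
};

/* True when the caller asked for the timer to stay on its own CPU. */
static bool mode_is_pinned(enum model_hrtimer_mode mode)
{
    assert((mode & ~0x3) == 0);
    return (mode & MODEL_MODE_PINNED) != 0;
}

/* True when the expiry value is relative rather than absolute. */
static bool mode_is_relative(enum model_hrtimer_mode mode)
{
    assert((mode & ~0x3) == 0);
    return (mode & MODEL_MODE_REL) != 0;
}
```

In this model MODEL_MODE_ABS_PINNED keeps the absolute expiry semantics of MODEL_MODE_ABS and only adds the request to stay on the calling CPU.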

Thanks,

        tglx
---

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 24d17ce..fbe554f 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -485,7 +485,7 @@ void qdisc_watchdog_schedule(struct qdisc_watchdog *wd, psched_time_t expires)
  wd->qdisc->flags |= TCQ_F_THROTTLED;
  time = ktime_set(0, 0);
  time = ktime_add_ns(time, PSCHED_TICKS2NS(expires));
- hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
+ hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS_PINNED);
 }
 EXPORT_SYMBOL(qdisc_watchdog_schedule);
 

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Jarek Poplawski

On Thu, Jul 09, 2009 at 02:03:50PM +0200, Thomas Gleixner wrote:

> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > >
> > > I have the feeling that the code relies on some implicit cpu
> > > boundness, which is no longer guaranteed with the timer migration
> > > changes, but that's a question for the network experts.
> >
> > As a matter of fact, I've just looked at this __netif_schedule(),
> > which really is cpu bound, so you might be 100% right.
>
> So the watchdog is the one which causes the trouble. The patch below
> should fix this.

I hope so. On the other hand it seems it should work with this
migration yet, so it probably needs additional debugging.

Thanks,
Jarek P.

> ---
>
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 24d17ce..fbe554f 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -485,7 +485,7 @@ void qdisc_watchdog_schedule(struct qdisc_watchdog *wd, psched_time_t expires)
>   wd->qdisc->flags |= TCQ_F_THROTTLED;
>   time = ktime_set(0, 0);
>   time = ktime_add_ns(time, PSCHED_TICKS2NS(expires));
> - hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
> + hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS_PINNED);
>  }
>  EXPORT_SYMBOL(qdisc_watchdog_schedule);
>  

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Thomas Gleixner

On Thu, 9 Jul 2009, Jarek Poplawski wrote:

> On Thu, Jul 09, 2009 at 02:03:50PM +0200, Thomas Gleixner wrote:
> > On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > > >
> > > > I have the feeling that the code relies on some implicit cpu
> > > > boundness, which is no longer guaranteed with the timer migration
> > > > changes, but that's a question for the network experts.
> > >
> > > As a matter of fact, I've just looked at this __netif_schedule(),
> > > which really is cpu bound, so you might be 100% right.
> >
> > So the watchdog is the one which causes the trouble. The patch below
> > should fix this.
>
> I hope so. On the other hand it seems it should work with this
> migration yet, so it probably needs additional debugging.

Right. I just provided the patch to narrow down the problem, but
please test the fix of the hrtimer migration code which I sent out a
bit earlier: http://lkml.org/lkml/2009/7/9/150

It fixes a possible endless loop in the timer code which is related to
the migration changes. Looking at the backtraces of the spinlock
lockup I think that is what you hit.

       spin_lock(root_lock);
       qdisc_run(q);
         __qdisc_run(q);
           dequeue_skb(q);
             q->dequeue(q);
               qdisc_watchdog_schedule();
                 hrtimer_start();
                   switch_hrtimer_base(); <- loops forever

Now the other CPU is stuck in dev_xmit() spin_lock(root_lock)
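
One way a retry loop of this kind can fail to terminate is sketched below as a bounded userspace model. This is my reading of the failure mode, not code taken from the kernel or from the linked patch, which remains the authoritative description. The function model_switch_base(), its parameters and the MODEL_MAX_TRIES cap are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_TRIES 1000  /* cap so that the model itself terminates */

/* Returns the number of passes needed to settle on a target CPU, or -1
 * if the cap is hit, which stands for "loops forever".
 * next_event_ns[cpu] is the next programmed timer event on that CPU. */
static int model_switch_base(int this_cpu, int preferred_cpu,
                             int64_t timer_expires_ns,
                             const int64_t *next_event_ns,
                             bool skip_check_on_this_cpu)
{
    int cpu = preferred_cpu;
    int tries;

    assert(this_cpu >= 0 && preferred_cpu >= 0);

    for (tries = 1; tries <= MODEL_MAX_TRIES; tries++) {
        bool check = !(skip_check_on_this_cpu && cpu == this_cpu);

        /* Accept the target unless the timer would expire before the
         * event that is already programmed on that CPU. */
        if (!check || timer_expires_ns >= next_event_ns[cpu])
            return tries;

        cpu = this_cpu;  /* fall back to the local CPU and retry */
    }
    return -1;
}
```

If the preferred CPU is already the local CPU and the check is still applied, falling back changes nothing and the loop makes no progress. Since the caller holds root_lock the whole time, any other CPU that wants that lock spins, which matches the call chain above.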

Thanks,

        tglx

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Jarek Poplawski

On Thu, Jul 09, 2009 at 04:15:28PM +0200, Thomas Gleixner wrote:

> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > On Thu, Jul 09, 2009 at 02:03:50PM +0200, Thomas Gleixner wrote:
> > > On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > > > >
> > > > > I have the feeling that the code relies on some implicit cpu
> > > > > boundness, which is no longer guaranteed with the timer migration
> > > > > changes, but that's a question for the network experts.
> > > >
> > > > As a matter of fact, I've just looked at this __netif_schedule(),
> > > > which really is cpu bound, so you might be 100% right.
> > >
> > > So the watchdog is the one which causes the trouble. The patch below
> > > should fix this.
> >
> > I hope so. On the other hand it seems it should work with this
> > migration yet, so it probably needs additional debugging.
>
> Right. I just provided the patch to narrow down the problem, but
> please test the fix of the hrtimer migration code which I sent out a
> bit earlier: http://lkml.org/lkml/2009/7/9/150
>
> It fixes a possible endless loop in the timer code which is related to
> the migration changes. Looking at the backtraces of the spinlock
> lockup I think that is what you hit.

Actually, Andres and Joao hit this, and I hope they'll try these two
patches.

Thanks,
Jarek P.

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Joao Correia

On Thu, Jul 9, 2009 at 3:24 PM, Jarek Poplawski<jarkao2@...> wrote:

> On Thu, Jul 09, 2009 at 04:15:28PM +0200, Thomas Gleixner wrote:
>> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
>> > On Thu, Jul 09, 2009 at 02:03:50PM +0200, Thomas Gleixner wrote:
>> > > On Thu, 9 Jul 2009, Jarek Poplawski wrote:
>> > > > >
>> > > > > I have the feeling that the code relies on some implicit cpu
>> > > > > boundness, which is no longer guaranteed with the timer migration
>> > > > > changes, but that's a question for the network experts.
>> > > >
>> > > > As a matter of fact, I've just looked at this __netif_schedule(),
>> > > > which really is cpu bound, so you might be 100% right.
>> > >
>> > > So the watchdog is the one which causes the trouble. The patch below
>> > > should fix this.
>> >
>> > I hope so. On the other hand it seems it should work with this
>> > migration yet, so it probably needs additional debugging.
>>
>> Right. I just provided the patch to narrow down the problem, but
>> please test the fix of the hrtimer migration code which I sent out a
>> bit earlier: http://lkml.org/lkml/2009/7/9/150
>>
>> It fixes a possible endless loop in the timer code which is related to
>> the migration changes. Looking at the backtraces of the spinlock
>> lockup I think that is what you hit.
>
> Actually, Andres and Joao hit this, and I hope they'll try these two
> patches.
>
> Thanks,
> Jarek P.
>

I can only try later on today. Will post back as soon as I do it.

Joao Correia

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Thomas Gleixner

On Thu, 9 Jul 2009, Jarek Poplawski wrote:

> On Thu, Jul 09, 2009 at 04:15:28PM +0200, Thomas Gleixner wrote:
> > On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > > On Thu, Jul 09, 2009 at 02:03:50PM +0200, Thomas Gleixner wrote:
> > > > On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > > > > >
> > > > > > I have the feeling that the code relies on some implicit cpu
> > > > > > boundness, which is no longer guaranteed with the timer migration
> > > > > > changes, but that's a question for the network experts.
> > > > >
> > > > > As a matter of fact, I've just looked at this __netif_schedule(),
> > > > > which really is cpu bound, so you might be 100% right.
> > > >
> > > > So the watchdog is the one which causes the trouble. The patch below
> > > > should fix this.
> > >
> > > I hope so. On the other hand it seems it should work with this
> > > migration yet, so it probably needs additional debugging.
> >
> > Right. I just provided the patch to narrow down the problem, but
> > please test the fix of the hrtimer migration code which I sent out a
> > bit earlier: http://lkml.org/lkml/2009/7/9/150
> >
> > It fixes a possible endless loop in the timer code which is related to
> > the migration changes. Looking at the backtraces of the spinlock
> > lockup I think that is what you hit.
>
> Actually, Andres and Joao hit this, and I hope they'll try these two
> patches.

Please test them separate from each other. The one I sent in this
thread was just for narrowing down the issue, but I'm now quite sure
that they really hit the issue which is addressed by the hrtimer
patch.

Thanks,

        tglx

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Andres Freund

Hi,

On Thursday 09 July 2009 16:28:05 Thomas Gleixner wrote:

> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > On Thu, Jul 09, 2009 at 04:15:28PM +0200, Thomas Gleixner wrote:
> > > On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > > > On Thu, Jul 09, 2009 at 02:03:50PM +0200, Thomas Gleixner wrote:
> > > > > On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > > > > > > I have the feeling that the code relies on some implicit cpu
> > > > > > > boundness, which is no longer guaranteed with the timer
> > > > > > > migration changes, but that's a question for the network
> > > > > > > experts.
> > > > > >
> > > > > > As a matter of fact, I've just looked at this __netif_schedule(),
> > > > > > which really is cpu bound, so you might be 100% right.
> > > > >
> > > > > So the watchdog is the one which causes the trouble. The patch
> > > > > below should fix this.
> > > >
> > > > I hope so. On the other hand it seems it should work with this
> > > > migration yet, so it probably needs additional debugging.
> > >
> > > Right. I just provided the patch to narrow down the problem, but
> > > please test the fix of the hrtimer migration code which I sent out a
> > > bit earlier: http://lkml.org/lkml/2009/7/9/150
> > >
> > > It fixes a possible endless loop in the timer code which is related to
> > > the migration changes. Looking at the backtraces of the spinlock
> > > lockup I think that is what you hit.
> >
> > Actually, Andres and Joao hit this, and I hope they'll try these two
> > patches.
>
> Please test them separate from each other. The one I sent in this
> thread was just for narrowing down the issue, but I'm now quite sure
> that they really hit the issue which is addressed by the hrtimer
> patch.

No crash yet. 15min running (seconds to a minute before).

Will let it run for some hours to be sure.

Nice!

Andres

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Thomas Gleixner

Andres,

On Thu, 9 Jul 2009, Andres Freund wrote:
> On Thursday 09 July 2009 16:28:05 Thomas Gleixner wrote:
> > Please test them separate from each other. The one I sent in this
> > thread was just for narrowing down the issue, but I'm now quite sure
> > that they really hit the issue which is addressed by the hrtimer
> > patch.
> No crash yet. 15min running (seconds to a minute before).
>
> Will let it run for some hours to be sure.

Which one of the patches are you testing ?

Thanks,

        tglx

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Andres Freund

On Thursday 09 July 2009 18:01:56 Thomas Gleixner wrote:

> Andres,
>
> On Thu, 9 Jul 2009, Andres Freund wrote:
> > On Thursday 09 July 2009 16:28:05 Thomas Gleixner wrote:
> > > Please test them separate from each other. The one I sent in this
> > > thread was just for narrowing down the issue, but I'm now quite sure
> > > that they really hit the issue which is addressed by the hrtimer
> > > patch.
> >
> > No crash yet. 15min running (seconds to a minute before).
> >
> > Will let it run for some hours to be sure.
>
> Which one of the patches are you testing ?

Your "real" one, i.e. de907e8432b08f2d5966c36e0747e97c0e596810

1h30m now...


Andres

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Thomas Gleixner

Andres,

On Thu, 9 Jul 2009, Andres Freund wrote:

> On Thursday 09 July 2009 18:01:56 Thomas Gleixner wrote:
> > Andres,
> >
> > On Thu, 9 Jul 2009, Andres Freund wrote:
> > > On Thursday 09 July 2009 16:28:05 Thomas Gleixner wrote:
> > > > Please test them separate from each other. The one I sent in this
> > > > thread was just for narrowing down the issue, but I'm now quite sure
> > > > that they really hit the issue which is addressed by the hrtimer
> > > > patch.
> > >
> > > No crash yet. 15min running (seconds to a minute before).
> > >
> > > Will let it run for some hours to be sure.
> >
> > Which one of the patches are you testing ?
> Your "real" one, i.e. de907e8432b08f2d5966c36e0747e97c0e596810
>
> 1h30m now...

Looks like I hit the nail on the head. Queueing it up for Linus.

Thanks,

        tglx

Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)

by Joao Correia

On Thu, Jul 9, 2009 at 6:44 PM, Thomas Gleixner<tglx@...> wrote:

> Andres,
>
> On Thu, 9 Jul 2009, Andres Freund wrote:
>
>> On Thursday 09 July 2009 18:01:56 Thomas Gleixner wrote:
>> > Andres,
>> >
>> > On Thu, 9 Jul 2009, Andres Freund wrote:
>> > > On Thursday 09 July 2009 16:28:05 Thomas Gleixner wrote:
>> > > > Please test them separate from each other. The one I sent in this
>> > > > thread was just for narrowing down the issue, but I'm now quite sure
>> > > > that they really hit the issue which is addressed by the hrtimer
>> > > > patch.
>> > >
>> > > No crash yet. 15min running (seconds to a minute before).
>> > >
>> > > Will let it run for some hours to be sure.
>> >
>> > Which one of the patches are you testing ?
>> Your "real" one, i.e. de907e8432b08f2d5966c36e0747e97c0e596810
>>
>> 1h30m now...
>
> Looks like I hit the nail on the head. Queueing it up for Linus.
>
> Thanks,
>
>        tglx
>

Confirmed working from me as well.

Thank you for your time,
Joao Correia