Interrupts as threads


Interrupts as threads

by Andrew Doran

Hi,

I have been thinking about the lock ordering problem with the kernel big
lock quite a bit and what it will take to lock the MI kernel down, and have
made some observations.

o There is no easy solution to the lock order problem with the kernel_lock
  when using spin locks.

o Using spin locks we will have to keep the SPL above IPL_NONE for longer
  than before, or accept (in non-trivial cases) the undesirable cost of
  having both interrupt and process context locks around some objects.

o Raising and lowering the SPL is expensive, especially on machines that
  need to talk with the hardware on SPL operation. The spin lock path also
  has more test+branch pairs / conditional moves and memory references
  involved than process locks. For a process context lock, the minimum we
  can get away with on entry and exit is one test+branch and two cache line
  references.

o Every spin lock / unlock pair denotes a critical section where threads
  running in the kernel can not be preempted. That's not currently an issue
  but if we move to support real time threads it could become one; I'm not
  sure.

o We are doing too much work from interrupt context.

The cleanest way to deal with these issues that I can see is to use
lightweight threads to handle interrupts. My initial thought is to have one
thread per level, per CPU. These would be able to preempt already running
threads, and would hold preempted threads in-situ until the interrupt thread
returns or switches away. In most cases, SPL operations would be replaced by
locks. Blocking would no longer be prohibited, but strongly discouraged - so
doing something like pool_get(foo, PR_WAITOK) should likely trigger an
assertion.

On something like an x86 or MIPS CPU, we wouldn't need to do a full context
switch for interrupts, just switch onto another stack. For things that are
time critical like clock or audio ISRs I think the current scheme of
deferring the interrupts might be better. Although it should not be common,
the delay involved in switching away when trying to acquire a lock seems
undesirable in these cases. (As an aside, I have been meaning to do some
profiling to see just how often the SPL operations serve their purpose in a
variety of cases but haven't gotten around to it yet.)

Assuming you subscribe to handling interrupts with threads, it raises the
question: where to draw the line between threaded and 'traditional'. It
certainly makes sense to run soft interrupts this way, and I would draw the
line at higher priority ISRs like network, audio, serial and clock.

Thoughts?

Cheers,
Andrew

Re: Interrupts as threads

by David Laight

On Fri, Dec 01, 2006 at 11:31:19PM +0000, Andrew Doran wrote:
>
> o There is no easy solution to the lock order problem with the kernel_lock
>   when using spin locks.

My brief perusal of the biglock code did make it clear how it worked when
an IRQ came in while the 2nd CPU had the biglock...

> o Using spin locks we will have to keep the SPL above IPL_NONE for longer
>   than before, or accept (in non-trivial cases) the undesirable cost of
>   having both interrupt and process context locks around some objects.

SMP code does tend to need mutex protection for longer than the non-SMP
code required IPL protection anyway...

> o Raising and lowering the SPL is expensive, especially on machines that
>   need to talk with the hardware on SPL operation. The spin lock path also
>   has more test+branch pairs / conditional moves and memory references
>   involved than process locks. For a process context lock, the minimum we
>   can get away with on entry and exit is one test+branch and two cache line
>   references.

Can we do deferred SPL changes on all archs?
i.e. don't frob the hardware, but assume it won't raise an IRQ. If an IRQ
does occur, mask it then and return, re-enabling it on the splx call.
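
As a rough illustration of the deferred scheme described above (a user-space
simulation, not code from NetBSD or from any port), raising the level is only
a store to memory. The simulated hardware is touched only when an interrupt
actually arrives while it is blocked, and again when the level is lowered.
All names here (lazy_cpu, spl_raise, irq_arrive, spl_x) are hypothetical, and
levels are assumed to run from 1 to 31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU state for the simulation. */
struct lazy_cpu {
    int      ipl;          /* software priority level, 0 = nothing blocked */
    uint32_t pending;      /* bit n set: level n arrived while blocked */
    uint32_t hw_masked;    /* bit n set: level n masked in simulated hardware */
    int      hw_writes;    /* number of simulated hardware accesses */
    int      handled[32];  /* how many times each level's handler ran */
};

/* Raise the level: a plain store, no hardware access. */
static int spl_raise(struct lazy_cpu *c, int level)
{
    int old = c->ipl;
    if (level > old)
        c->ipl = level;
    return old;
}

/* Run the handler for one level at that level, then restore the old level. */
static void run_handler(struct lazy_cpu *c, int level)
{
    int saved = c->ipl;
    c->ipl = level;
    c->handled[level]++;
    c->ipl = saved;
}

/* An interrupt arrives: defer it if blocked, otherwise handle it now. */
static void irq_arrive(struct lazy_cpu *c, int level)
{
    if (level <= c->ipl) {
        c->pending   |= 1u << level;   /* remember it */
        c->hw_masked |= 1u << level;   /* only now mask it in hardware */
        c->hw_writes++;
        return;
    }
    run_handler(c, level);
}

/* Lower the level and replay anything that was deferred above it. */
static void spl_x(struct lazy_cpu *c, int level)
{
    c->ipl = level;
    for (int n = 31; n > level; n--) {
        uint32_t bit = 1u << n;
        if (c->pending & bit) {
            c->pending   &= ~bit;
            c->hw_masked &= ~bit;      /* unmask in hardware */
            c->hw_writes++;
            run_handler(c, n);
        }
    }
}
```

In this model an spl_raise()/spl_x() pair that is never interrupted costs no
hardware accesses at all; whether that carries over to every architecture is
exactly the open question.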

> o Every spin lock / unlock pair denotes a critical section where threads
>   running in the kernel can not be preempted. That's not currently an issue
>   but if we move to support real time threads it could become one; I'm not
>   sure.

Once we have a proper SMP kernel, making processes pre-emptable in the kernel
(while not holding a lock) becomes ~free - whereas the non-SMP kernel code
will make assumptions that stop pre-empting.
However you probably want to be able to disable pre-emption without
holding a mutex.

> o We are doing too much work from interrupt context.

possibly true...  but the cpu cycles have to be done somewhen, and deferring
to a different context just takes time.

> The cleanest way to deal with these issues that I can see is to use
> lightweight threads to handle interrupts

That just makes it worse! Every hardware ISR would have to disable the
IRQ itself, then the 'low level' ISR would need to re-enable it.

> My initial thought is to have one
> thread per level, per CPU. These would be able to preempt already running
> threads, and would hold preempted threads in-situ until the interrupt thread
> returns or switches away. In most cases, SPL operations would be replaced by
> locks.

You'd have to look very closely at priority inversion problems....

> Blocking would no longer be prohibited, but strongly discouraged - so
> doing something like pool_get(foo, PR_WAITOK) should likely trigger an
> assertion.

If you allow blocking, then people will use it because it 'appears to work'
then you find that one of your ISR threads is busy - not a problem until
several interrupt routines/drivers do it at the same time and you run out
of threads to do the wakeup.

> Assuming you subscribe to handling interrupts with threads, it raises the
> question: where to draw the line between threaded and 'traditional'. It
> certainly makes sense to run soft interrupts this way, and I would draw the
> line at higher priority ISRs like network, audio, serial and clock.

It may make sense to use kernel threads (running through the scheduler)
for some driver activity, and quite probably some of the code scheduled
via 'softint' is in this category.  But IMHO this really needs to be code
that isn't really related to 'interrupt processing'.

        David

--
David Laight: david@...

Re: Interrupts as threads

by Andrew Doran

On Sat, Dec 02, 2006 at 10:25:42AM +0000, David Laight wrote:

> On Fri, Dec 01, 2006 at 11:31:19PM +0000, Andrew Doran wrote:
> >
> > o There is no easy solution to the lock order problem with the kernel_lock
> >   when using spin locks.
>
> My brief perusal of the biglock code did make it clear how it worked when
> an IRQ came in while the 2nd CPU had the biglock...

The problem is that at the moment, acquiring any given spin lock is a
deadlock waiting to happen unless you are at ~IPL_SCHED or above when
acquiring it, or are certain that it will only ever be acquired with the
kernel lock unheld. Ensuring that the kernel lock is always held when you
acquire the spin lock means that the spin lock is useless :-).

It would not be a major undertaking to push that down to IPL_VM by making
other interrupt handlers MP safe, the bulk of which are audio drivers I
think. Once you move beyond those drivers, things start to get more
difficult and I'm not convinced that we have the resources to tackle the
issue that way.

> > o Using spin locks we will have to keep the SPL above IPL_NONE for longer
> >   than before, or accept (in non-trivial cases) the undesirable cost of
> >   having both interrupt and process context locks around some objects.
>
> SMP code does tend to need mutex protection for longer than the non-SMP
> code required IPL protection anyway...

Yup.
 

> > o Raising and lowering the SPL is expensive, especially on machines that
> >   need to talk with the hardware on SPL operation. The spin lock path also
> >   has more test+branch pairs / conditional moves and memory references
> >   involved than process locks. For a process context lock, the minimum we
> >   can get away with on entry and exit is one test+branch and two cache line
> >   references.
>
> Can we do deferred SPL changes on all archs?
> i.e. don't frob the hardware, but assume it won't raise an IRQ. If an IRQ
> does occur, mask it then and return, re-enabling it on the splx call.

We probably could, but it's still expensive to acquire spin locks that
modify the SPL. Have a look at arch/i386/i386/lock_stubs.S on the newlock2
branch and compare the paths for spin vs. adaptive mutexes. (I know that at
least _lock_set_waiters() is broken. Also, I haven't optimised x86 assembly
in a long time - please don't laugh. ;-)

> > o Every spin lock / unlock pair denotes a critical section where threads
> >   running in the kernel can not be preempted. That's not currently an issue
> >   but if we move to support real time threads it could become one; I'm not
> >   sure.
>
> Once we have a proper SMP kernel, making processes pre-emptable in the kernel
> (while not holding a lock) becomes ~free - whereas the non-SMP kernel code
> will make assumptions that stop pre-empting.
> However you probably want to be able to disable pre-emption without
> holding a mutex.

Yup, agreed.
 
> > The cleanest way to deal with these issues that I can see is to use
> > lightweight threads to handle interrupts
>
> That just makes it worse! Every hardware ISR would have to disable the
> IRQ itself, then the 'low level' ISR would need to re-enable it.

I don't see why we would have to do that. The way I envisioned this working,
for a typical interrupt path that doesn't have to block on any locks, the
prologue and epilogue for the interrupt itself (not the individual ISRs)
would have a few additional steps. Something like:

entry:  pick an idle interrupt LWP from curcpu
        note which LWP we have preempted
        set curlwp
        switch onto new stack

exit:   restore curlwp
        switch back to old stack

If one of the ISRs needs to block, then we would have to switch back to the
LWP that was preempted and hold the SPL at the ISR's base level until the
ISR receives the lock. So in effect, things like splx() would become a
request to alter the SPL, but wouldn't be able to drop it below the base
level for the CPU (which would typically be IPL_NONE).
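
One way to picture the entry/exit steps and the "splx() as a request" idea
above is the following user-space C sketch. It only models the bookkeeping;
the stack switch is a comment, and every name in it (ithread_cpu, intr_enter,
intr_exit, intr_block, splx_request, and so on) is hypothetical rather than
taken from the newlock2 branch.

```c
#include <assert.h>
#include <stddef.h>

#define IPL_NONE 0
#define NLEVELS  8

struct lwp_sim {
    int busy;                           /* nonzero while the LWP is in use */
};

struct ithread_cpu {
    struct lwp_sim *curlwp;             /* LWP currently running */
    struct lwp_sim  intr_lwp[NLEVELS];  /* one interrupt LWP per level */
    int ipl;                            /* current priority level */
    int base_ipl;                       /* floor that splx cannot go below */
};

struct intr_frame {
    struct lwp_sim *preempted;          /* LWP we interrupted */
    int saved_ipl;                      /* level it was running at */
};

/* splx() as a request: the level never drops below the CPU's base level. */
static void splx_request(struct ithread_cpu *c, int level)
{
    c->ipl = level < c->base_ipl ? c->base_ipl : level;
}

/* entry: pick the idle interrupt LWP, note the preempted LWP, set curlwp. */
static void intr_enter(struct ithread_cpu *c, int level, struct intr_frame *f)
{
    struct lwp_sim *il = &c->intr_lwp[level];
    assert(!il->busy);
    f->preempted = c->curlwp;
    f->saved_ipl = c->ipl;
    il->busy = 1;
    c->curlwp = il;
    c->ipl = level;
    /* a real kernel would switch onto the interrupt LWP's stack here */
}

/* exit: restore curlwp and the old level (and the old stack). */
static void intr_exit(struct ithread_cpu *c, struct intr_frame *f)
{
    c->curlwp->busy = 0;
    c->curlwp = f->preempted;
    splx_request(c, f->saved_ipl);
}

/* The ISR must wait for a lock: resume the preempted LWP, but hold the
 * base level at the ISR's level so the same interrupt cannot recur. */
static void intr_block(struct ithread_cpu *c, int level, struct intr_frame *f)
{
    c->base_ipl = level;
    c->curlwp = f->preempted;
    splx_request(c, f->saved_ipl);
}

/* Models the blocked ISR having obtained its lock and run to completion. */
static void intr_finish_blocked(struct ithread_cpu *c, int level,
                                struct intr_frame *f)
{
    c->intr_lwp[level].busy = 0;
    c->base_ipl = IPL_NONE;
    splx_request(c, f->saved_ipl);
}
```

The design choice worth noting is that the common path (intr_enter followed
by intr_exit) never touches the scheduler; only intr_block does anything
resembling a context switch.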
 
> > My initial thought is to have one
> > thread per level, per CPU. These would be able to preempt already running
> > threads, and would hold preempted threads in-situ until the interrupt thread
> > returns or switches away. In most cases, SPL operations would be replaced by
> > locks.
>
> You'd have to look very closely at priority inversion problems....

True.. The turnstile mechanism would handle this for locks, but that's
currently implemented.
 
> > Blocking would no longer be prohibited, but strongly discouraged - so
> > doing something like pool_get(foo, PR_WAITOK) should likely trigger an
> > assertion.
>
> If you allow blocking, then people will use it because it 'appears to work'
> then you find that one of your ISR threads is busy

I considered that but I feel that it's really a documentation and code
review issue. There are plenty of places we can put assertions though. For
instance around memory allocation, or direct use of the tsleep() / cv_wait()
interfaces. In any case there are plenty of inadvisable things that you can
do from an interrupt handler already, and we have drivers that already do (I
wrote one of them :-).
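
A minimal sketch of the kind of assertion meant here, written as user-space
C. The flag names SK_WAITOK and SK_NOWAIT stand in for PR_WAITOK-style flags,
and in_interrupt is a hypothetical stand-in for whatever the kernel would use
to tell that it is running on an interrupt thread; none of this is the real
pool(9) interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SK_WAITOK 0x1   /* caller is willing to sleep for memory */
#define SK_NOWAIT 0x2   /* caller wants failure rather than sleeping */

/* Hypothetical flag: true while running on behalf of an interrupt. */
static bool in_interrupt;

/* A sleeping request is acceptable only outside interrupt context. */
static bool context_allows(int flags)
{
    return !(in_interrupt && (flags & SK_WAITOK) != 0);
}

/* Allocation wrapper that trips an assertion on a sleeping request made
 * from interrupt context, instead of letting it "appear to work". */
static void *sketch_alloc(size_t size, int flags)
{
    assert(context_allows(flags));
    return malloc(size);
}
```

The same check could sit in front of any interface that may sleep, which is
what makes it a cheap complement to documentation and code review.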

> > Assuming you subscribe to handling interrupts with threads, it raises the
> > question: where to draw the line between threaded and 'traditional'. It
> > certainly makes sense to run soft interrupts this way, and I would draw the
> > line at higher priority ISRs like network, audio, serial and clock.
>
> It may make sense to use kernel threads (running through the scheduler)
> for some driver activity, and quite probably some of the code scheduled
> via 'softint' is in this category.

Sure, but there is a cutoff point: that introduces the full overhead of the
scheduler in addition to taking the interrupt. If the softint handler is an
interrupt thread, then that doesn't happen unless absolutely necessary.

Thanks,
Andrew

Re: Interrupts as threads

by Bill Stouder-Studenmund

On Sun, Dec 03, 2006 at 10:36:30PM +0000, Andrew Doran wrote:

> On Sat, Dec 02, 2006 at 10:25:42AM +0000, David Laight wrote:
>
> > On Fri, Dec 01, 2006 at 11:31:19PM +0000, Andrew Doran wrote:
> > >
> > > o There is no easy solution to the lock order problem with the kernel_lock
> > >   when using spin locks.
> >
> > My brief perusal of the biglock code did make it clear how it worked when
> > an IRQ came in while the 2nd CPU had the biglock...
>
> The problem is that at the moment, acquiring any given spin lock is a
> deadlock waiting to happen unless you are at ~IPL_SCHED or above when
> acquiring it, or are certain that it will only ever be acquired with the
> kernel lock unheld. Ensuring that the kernel lock is always held when you
> acquire the spin lock means that the spin lock is useless :-).

I've been thinking about this, and I think you are not correct. Well. In
the long-term, you are. But I think as a transition step, we may have to
accept it.

All we have to do is define a correct locking hierarchy. It's ok to acquire
a given spin lock if you have the kernel lock. It's ok to acquire said
spinlock w/o the kernel lock. It's just NOT ok to acquire the kernel lock
while holding said lock. Yes, this could make interrupt handling routines
painful (when they hand things off to the main kernel), but we can do it.
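
The hierarchy above can be expressed as a small debugging check. This is
only a sketch under the assumption that each CPU tracks how many spin locks
it holds; the structure and function names are hypothetical, and the strict
rule (no kernel lock acquisition at all while a spin lock is held) is a
simplification that ignores recursive acquisition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU lock bookkeeping. */
struct lock_order {
    int  spin_held;      /* number of spin locks this CPU holds */
    bool biglock_held;   /* whether this CPU holds the kernel lock */
};

/* The kernel lock may only be taken when no spin lock is held. */
static bool biglock_enter_ok(const struct lock_order *s)
{
    return s->spin_held == 0;
}

/* Spin locks may be taken with or without the kernel lock. */
static void spin_enter(struct lock_order *s)
{
    s->spin_held++;
}

static void spin_exit(struct lock_order *s)
{
    assert(s->spin_held > 0);
    s->spin_held--;
}

static void biglock_enter(struct lock_order *s)
{
    assert(biglock_enter_ok(s));   /* catches kernel lock after spin lock */
    s->biglock_held = true;
}

static void biglock_exit(struct lock_order *s)
{
    assert(s->biglock_held);
    s->biglock_held = false;
}
```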

Take care,

Bill


Re: Interrupts as threads

by Andrew Doran

Hi Bill,

On Tue, Dec 19, 2006 at 04:05:24PM -0800, Bill Studenmund wrote:

> On Sun, Dec 03, 2006 at 10:36:30PM +0000, Andrew Doran wrote:

> > The problem is that at the moment, acquiring any given spin lock is a
> > deadlock waiting to happen unless you are at ~IPL_SCHED or above when
> > acquiring it, or are certain that it will only ever be acquired with the
> > kernel lock unheld. Ensuring that the kernel lock is always held when you
> > acquire the spin lock means that the spin lock is useless :-).
>
> I've been thinking about this, and I think you are not correct. Well. In
> the long-term, you are. But I think as a transition step, we may have to
> accept it.

I'm not sure I follow. We did discuss this briefly and I think that you were
proposing a scheme where we have the individual interrupt handlers acquire
the kernel lock, although I might have completely misunderstood. :-)

> All we have to do is define a correct locking hierarchy. It's ok to acquire
> a given spin lock if you have the kernel lock. It's ok to acquire said
> spinlock w/o the kernel lock. It's just NOT ok to acquire the kernel lock
> while holding said lock. Yes, this could make interrupt handling routines
> painful (when they hand things off to the main kernel), but we can do it.

The problem as I see it is: getting it right all at once would be a big
effort, and I don't see that we have the resources to do it that way.

I said that I wanted to use interrupts as threads, and process context locks
(the Solaris style mutexes) for MP safety. The two basic problems I want to
solve by doing that are:

o The one mentioned above - if the locks are difficult to use (potential
  deadlocks against the kernel_lock) then that's a problem.  For places
  where we want MP safety (not MT safety) the mutexes can be thought of as
  spin locks, but with the additional property that they can block in order to
  avoid deadlock. That might be against the current CPU already holding the
  mutex (preemption), or against the CPU that already holds the mutex
  wanting the big lock. Essentially, they remove the ordering constraint
  against the big lock.

o If we keep the distinction between interrupt and process context across
  the board, then we're likely to increase the number of locks in the
  kernel: one set for interrupt access, and one for process access. I've
  spent a lot of time trying to make process and LWP state MP safe for
  signalling. Even though it works well enough I'm not particularly happy
  with the end result, because there's a mix of locks where we only need
  ~half the number. It's not good to use spinlocks with a raised SPL in a
  lot of places, because we might need to block briefly on a process context
  lock, or in the overall picture we end up holding interrupts for much
  longer.
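
To make the first of the two points above concrete, here is a sketch of the
decision such a mutex would take on acquisition: spin while the owner is
running on another CPU, block in the two cases named above. It is a
single-threaded model with hypothetical names (mtx_sim, mtx_try,
mtx_release); a real implementation would use atomic operations and would
not look like this.

```c
#include <assert.h>
#include <stdbool.h>

enum mtx_action { MTX_ACQUIRED, MTX_SPIN, MTX_BLOCK };

/* Hypothetical mutex state for the model. */
struct mtx_sim {
    int  owner_cpu;       /* CPU of the owner, or -1 if the mutex is free */
    bool owner_running;   /* false if the owner is itself waiting, e.g.
                           * for the big lock */
};

static enum mtx_action mtx_try(struct mtx_sim *m, int self_cpu)
{
    if (m->owner_cpu < 0) {
        m->owner_cpu = self_cpu;
        m->owner_running = true;
        return MTX_ACQUIRED;
    }
    if (m->owner_cpu == self_cpu)
        return MTX_BLOCK;   /* owner was preempted on this CPU: spinning
                             * here would never let it run */
    if (!m->owner_running)
        return MTX_BLOCK;   /* owner is not making progress: get out of
                             * its way rather than spin */
    return MTX_SPIN;        /* owner is running elsewhere: short wait */
}

static void mtx_release(struct mtx_sim *m)
{
    m->owner_cpu = -1;
    m->owner_running = false;
}
```

Because the waiter gives up the CPU in exactly the cases where spinning
could deadlock, the ordering constraint against the big lock goes away in
this model.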

I did some simple profiling on the SPL operations and here is example output
from a run that I did. It's from a single CPU machine serving up files at
line rate using 8 ftpds over 100Mbps Ethernet, and doing some disk I/O
locally. I wasn't looking for any specific kind of behaviour other than
'doing I/O'. The machine is set up so that there is a 1:1 mapping between each
symbolic interrupt level and each hardware IRQ line.

    1    |         2         |         3         |         4         |          5
level    |    intrs   persec |  blkself   persec | blkother   persec |  splraise   persec
---------+-------------------+-------------------+-------------------+--------------------
softnet  |   274094     6334 |    47549     1098 |        0        0 |    572849    13239
bio      |    17939      414 |      252        5 |        0        0 |   1415581    32717
net      |   373121     8623 |    29857      690 |     1760       40 |    631179    14588
tty      |        0        0 |        0        0 |        3        0 |       428        9
vm       |        0        0 |        0        0 |    11646      269 |   5211031   120438
audio    |        0        0 |        0        0 |        0        0 |         0        0
clock    |     4328      100 |      353        8 |    33913      783 |   1156023    26718
high     |        0        0 |        0        0 |      112        2 |    291410     6735
ipi      |        0        0 |        0        0 |        8        0 |       435       10

The columns are:

1. Interrupt level.
2. Number of interrupts at that level.
3. Number of splfoo() operations that blocked an interrupt at level foo.
   This doesn't include the SPL adjustment that MD code does as part of
   taking the interrupt.
4. Number of splfoo() operations that blocked an interrupt below level
   foo. Note that this does not count blocked soft interrupts.
5. Number of splfoo() operations.
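
For anyone who wants to play with the numbers, the table can be reduced to
"what fraction of spl calls at each level actually held off an interrupt",
i.e. (column 3 + column 4) / column 5. The C below just encodes the table
above; the structure and function names are made up for the example. For
this run it comes out at roughly 0.2% for vm and about 5% for net.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the table: columns 1 to 5, without the per-second rates. */
struct spl_row {
    const char *level;
    long intrs, blkself, blkother, splraise;
};

static const struct spl_row spl_rows[] = {
    { "softnet", 274094, 47549,     0,  572849 },
    { "bio",      17939,   252,     0, 1415581 },
    { "net",     373121, 29857,  1760,  631179 },
    { "tty",          0,     0,     3,     428 },
    { "vm",           0,     0, 11646, 5211031 },
    { "audio",        0,     0,     0,       0 },
    { "clock",     4328,   353, 33913, 1156023 },
    { "high",         0,     0,   112,  291410 },
    { "ipi",          0,     0,     8,     435 },
};

static const size_t spl_nrows = sizeof spl_rows / sizeof spl_rows[0];

/* Fraction of splfoo() calls that blocked at least one interrupt. */
static double blocked_fraction(const struct spl_row *r)
{
    if (r->splraise == 0)
        return 0.0;
    return (double)(r->blkself + r->blkother) / (double)r->splraise;
}

/* Total number of spl calls across all levels. */
static long total_splraise(void)
{
    long sum = 0;
    for (size_t i = 0; i < spl_nrows; i++)
        sum += spl_rows[i].splraise;
    return sum;
}

/* Total number of spl calls that blocked an interrupt. */
static long total_blocked(void)
{
    long sum = 0;
    for (size_t i = 0; i < spl_nrows; i++)
        sum += spl_rows[i].blkself + spl_rows[i].blkother;
    return sum;
}
```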

Given the amount of contention that this _one_ test shows it's clear that we
can't just use interrupts as threads and replace the SPL system with locks,
or we will really suffer from all the additional context switching. At
least, we can't do that without a major engineering effort to change how
work is passed from one level to the next.

I think that we can also cut the number of spl calls significantly. For one,
the number of splvm() calls indicated above seems a bit excessive.

What I propose is using interrupts as threads for IPL_VM and below, but also
preserving the SPL system. We would certainly need to use it for networking.
In other areas (like block I/O) where there aren't so many chokepoints I'm
inclined to believe that we can get away without it.

I don't think interrupts as threads have to be expensive: typically just a
stack switch and some additional accounting in the interrupt path. In
addition to the interrupt coming in, there would need to be defined
preemption points, like lock release or splx(). That's in contrast to (to
use an example that people have cited) FreeBSD, where the initial approach
was a lot more costly as I understand it: in addition to taking the
interrupt, a thread had to be picked and scheduled to run at a later time,
for example on return to user space. I don't know how their system works
these days.

Thoughts?

Cheers,
Andrew

Re: Interrupts as threads

by Perry E. Metzger


Andrew Doran <ad@...> writes:
> The problem as I see it is: getting it right all at once would be a big
> effort, and I don't see that we have the resources to do it that way.

For what it is worth, I think an implementation path that allows
incremental steps that each are worthwhile (more or less what I see
you saying) is better than one that requires a massive application of
resources we don't have.

As for the particular decisions made, I have a bias towards letting
the person actually doing work have the biggest say unless they're
acting crazy, and you aren't acting crazy. You are clearly leading
this right now by doing the actual programming.

Perry