Code patching, context switching

by Sandro Magi-2

I want to use Lightning for user-level context switching, similar to
GNU Pth. I would expect it to be much faster since it's in assembly,
and signals aren't handled specially. This raises two questions
though:

 1. There is currently no Lightning equivalent to the x86's 'pusha'
instruction [1], correct?
 2. I assume then that I would have to save all registers by iterating
over JIT_R/V/FPR manually while generating this code. Are there any
registers or other state that aren't accessible via Lightning that
might impact context switching?

I'm also curious what it would take to save "machine contexts" in
Lightning. Some architectures provide special instructions for saving
the register file which I assume is more efficient than saving them
one by one in the above way.

Finally, since this function must be linked with some other code I
have at compile time, I need some sort of stub to act as a
placeholder with sufficient room to patch the code later. For
instance:

/* from and to are buffers sufficiently large to hold the register file */
static void ctxt_switch(char *from, char *to) {
  from = from;
  to = to;
  from = from;
  to = to;
  ...
}

I can then take the address of ctxt_switch and pass it to Lightning as
the code buffer and patch the contents of ctxt_switch to perform an
actual context switch. The other alternative is to simply patch in a
direct jump at &ctxt_switch into my code generated elsewhere. Any
other thoughts or recommendations?

Sandro

[1] http://docs.sun.com/app/docs/doc/817-5477/6mkuavhri?a=view


_______________________________________________
Lightning mailing list
Lightning@...
http://lists.gnu.org/mailman/listinfo/lightning

Re: Code patching, context switching

by Paolo Bonzini-2

> I want to use Lightning for user-level context switching, similar to
> GNU Pth.

Why not use makecontext/swapcontext?  (which is what Pth uses except for
signal handling).
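
For reference, here is a minimal sketch of the makecontext/swapcontext approach, assuming a POSIX system that still ships <ucontext.h> (glibc does). The names coroutine, run_demo and trace are illustrative only and come from neither Pth nor Lightning. It switches into a second context on its own stack, yields back, then resumes it:

```c
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, co_ctx;   /* saved contexts for both sides */
static char co_stack[STACK_SIZE];     /* private stack for the coroutine */
int trace[4];                         /* records the order of execution */
int trace_len;

static void coroutine(void) {
    trace[trace_len++] = 1;           /* first entry into the coroutine */
    swapcontext(&co_ctx, &main_ctx);  /* yield back to run_demo */
    trace[trace_len++] = 3;           /* resumed by the second swap */
    /* returning here continues at uc_link, i.e. main_ctx */
}

/* Runs the coroutine to completion; returns the number of trace entries,
   or -1 if the context could not be initialised. */
int run_demo(void) {
    trace_len = 0;
    if (getcontext(&co_ctx) == -1)
        return -1;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;       /* where to go when coroutine returns */
    makecontext(&co_ctx, coroutine, 0);

    swapcontext(&main_ctx, &co_ctx);  /* run until the first yield */
    trace[trace_len++] = 2;
    swapcontext(&main_ctx, &co_ctx);  /* resume; coroutine then finishes */
    trace[trace_len++] = 4;
    return trace_len;
}
```

After run_demo() the trace array records the interleaving between the two contexts, which is the behaviour a hand-written switch would have to reproduce.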

>  1. There is currently no Lightning equivalent to the x86's 'pusha'
> instruction [1], correct?
>  2. I assume then that I would have to save all registers by iterating
> over JIT_R/V/FPR manually while generating this code. Are there any
> registers or other state that aren't accessible via Lightning that
> might impact context switching?

Yes.  On the PowerPC, the argument registers are *not* available to the
user, because they are caller-save while GNU lightning always provides
the illusion that they are callee-save.

> /* from and to are buffers sufficiently large to hold the register file */
> static void ctxt_switch(char *from, char *to) {
>   from = from;
>   to = to;
>   from = from;
>   to = to;
>   ...
> }
>
> I can then take the address of ctxt_switch and pass it to Lightning as
> the code buffer and patch the contents of ctxt_switch to perform an
> actual context switch. The other alternative is to simply patch in a
> direct jump at &ctxt_switch into my code generated elsewhere. Any
> other thoughts or recommendations?

I think that the code is readonly on all the platforms I ever looked at
that have an MMU.  You would have to use mprotect first; then I think
that patching a direct jump is fast enough.
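
A rough sketch of the mprotect step, assuming a POSIX system. mprotect wants a page-aligned start address, so the target address is rounded down first. The helper names (page_start, set_protection, patch_demo) are made up for illustration, and the demo uses an anonymous read-only mapping as a stand-in for a code page. Patching real code would also need PROT_EXEC, may be refused on systems that forbid writable and executable pages, and on some architectures requires an instruction cache flush afterwards:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of the page containing it. */
void *page_start(const void *addr, size_t page_size) {
    return (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
}

/* Apply `prot` to every page overlapping [addr, addr + len).
   Returns 0 on success, -1 on failure (as mprotect does). */
int set_protection(void *addr, size_t len, int prot) {
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    char *start = page_start(addr, page_size);
    size_t span = (size_t)(((char *)addr + len) - start);
    return mprotect(start, span, prot);
}

/* Demo: map one read-only page, make it writable, patch a byte,
   then make it read-only again. Returns 0 if the patch landed. */
int patch_demo(void) {
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *page = mmap(NULL, page_size, PROT_READ,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    unsigned char *target = page + 100;   /* deliberately not page-aligned */
    int ok = -1;
    if (set_protection(target, 8, PROT_READ | PROT_WRITE) == 0) {
        target[0] = 0x42;                 /* the "patch" */
        if (set_protection(target, 8, PROT_READ) == 0 && target[0] == 0x42)
            ok = 0;
    }
    munmap(page, page_size);
    return ok;
}
```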

However, beware the compiler.  It will remove all those pseudo-NOP
statements that (it looks like this, at least) you placed to make
ctxt_switch big enough.  It might also inline ctxt_switch, which would
screw things up a lot.  I would just make ctxt_switch a function pointer.
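
The function-pointer variant could look roughly like this. It is a sketch only: CTX_SIZE, install_ctxt_switch, do_switch and fake_generated_switch are hypothetical names, and the "generated" routine is an ordinary C function standing in for whatever Lightning would emit:

```c
#include <string.h>

#define CTX_SIZE 64   /* hypothetical size of a saved register file */

typedef void (*ctxt_switch_fn)(char *from, char *to);

/* Default target until generated code is installed: does nothing. */
static void ctxt_switch_stub(char *from, char *to) {
    (void)from;
    (void)to;
}

/* Callers always go through this pointer, so there is no function body
   for the compiler to strip of padding or to inline. */
static ctxt_switch_fn ctxt_switch = ctxt_switch_stub;

/* Stand-in for JIT output: copies one buffer to the other so that the
   effect of the call is observable. Not a real context switch. */
void fake_generated_switch(char *from, char *to) {
    memcpy(to, from, CTX_SIZE);
}

/* Point the indirection at freshly generated code. */
void install_ctxt_switch(ctxt_switch_fn generated) {
    ctxt_switch = generated;
}

/* What the rest of the program calls. */
void do_switch(char *from, char *to) {
    ctxt_switch(from, to);
}
```

The cost is one indirect call per switch; in exchange nothing in the compiled image has to be made writable.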

Paolo


Re: Code patching, context switching

by Sandro Magi-2

On Wed, Feb 20, 2008 at 1:56 AM, Paolo Bonzini <bonzini@...> wrote:
> > I want to use Lightning for user-level context switching, similar to
>  > GNU Pth.
>
>  Why not use makecontext/swapcontext?  (which is what Pth uses except for
>  signal handling).

Pth uses a whole bunch of different methods, including makecontext and
fiddling with jmp_buf, depending on what's available. makecontext and
friends are more or less deprecated interfaces. An assembly generated
context switch is always going to be faster though, and I believe some
of those calls invoke the kernel, in which case I pretty much lose the
advantage of user-level threads.

>  >  1. There is currently no Lightning equivalent to the x86's 'pusha'
>  > instruction [1], correct?
>  >  2. I assume then that I would have to save all registers by iterating
>  > over JIT_R/V/FPR manually while generating this code. Are there any
>  > registers or other state that aren't accessible via Lightning that
>  > might impact context switching?
>
>  Yes.  On the PowerPC, the argument registers are *not* available to the
>  user, because they are caller-save while GNU lightning always provides
>  the illusion that they are callee-save.

I don't get it. Are you saying that the PPC's entire register file is
not accessible via JIT_R, JIT_V and JIT_FPR? Skimming this intro to
PPC doesn't point out anything immediately wrong with what I'm
proposing due to callee vs caller-save conventions:

http://www.ibm.com/developerworks/linux/library/l-ppc/

I was under the impression that PPC's general purpose registers are
truly general purpose, like that article says.  If it's not possible,
how is setjmp/longjmp implemented on PPC? Hmm, just found this library
using portable setjmp fiddling and it supports PPC:

http://www.cs.uiowa.edu/~jones/opsys/threads/

I'm wary of setjmp though, because I've inspected the Windows setjmp.h
header, and it doesn't seem to save the floating point registers on
x86. Anyway, it sounds like you're implying this is a Lightning
limitation, so any clarification is much appreciated. :-)

>  > I can then take the address of ctxt_switch and pass it to Lightning as
>  > the code buffer and patch the contents of ctxt_switch to perform an
>  > actual context switch. The other alternative is to simply patch in a
>  > direct jump at &ctxt_switch into my code generated elsewhere. Any
>  > other thoughts or recommendations?
>
>  I think that the code is readonly on all the platforms I ever looked at
>  that have an MMU.  You would have to use mprotect first; then I think
>  that patching a direct jump is fast enough.
>
>  However, beware the compiler.  It will remove all those pseudo-NOP
>  statements that (it looks like this, at least) you placed to make
>  ctxt_switch big enough.  It might also inline ctxt_switch, which would
>  screw things up a lot.  I would just make ctxt_switch a function pointer.

Yup, I was concerned about inlining, but that's solvable. I don't want
to use a function pointer for the same reason that I'm avoiding CPS:
indirect calls tend to flush the pipeline. It's a weaker argument
here, because the branch predictor will probably cache the target, and
since the target never changes, it should only flush on the first
call, but I don't want to rely on a decent branch predictor.

Sandro


Re: Code patching, context switching

by Paolo Bonzini-2


>>  Yes.  On the PowerPC, the argument registers are *not* available to the
>>  user, because they are caller-save while GNU lightning always provides
>>  the illusion that they are callee-save.
>
> I don't get it. Are you saying that the PPC's entire register file is
> not accessible via JIT_R, JIT_V and JIT_FPR?

Yes.  r3-r9, which the ABI uses for argument passing, are not accessible
at all.  In the prolog, they are moved to other, callee-save registers,
which are also not accessible with JIT_{R,V} but only with jit_getarg.

> I'm wary of setjmp though, because I've inspected the Windows setjmp.h
> header, and it doesn't seem to save the floating point registers on
> x86. Anyway, it sounds like you're implying this is a Lightning
> limitation, so any clarification is much appreciated. :-)

Yes, it is.

Paolo


Re: Code patching, context switching

by Sandro Magi-2

On Wed, Feb 20, 2008 at 11:55 AM, Paolo Bonzini <bonzini@...> wrote:
>
>  > I don't get it. Are you saying that the PPC's entire register file is
>  > not accessible via JIT_R, JIT_V and JIT_FPR?
>
>  Yes.  r3-r9, which the ABI uses for argument passing, are not accessible
>  at all.  In the prolog, they are moved to other, callee-save registers,
>  which are also not accessible with JIT_{R,V} but only with jit_getarg.

That's unfortunate. Is there a good reason for this artificial
limitation? Seems Lightning would be much more flexible as a code
generator with a low-level interface, which is what I assumed
JIT_R/V/FPR were, and a higher-level interface, which is what
jit_getarg and jit_prolog are. I can understand wanting to insulate
the users from inadvertently clobbering the argument registers, but a
low-level api would still be nice.

Anyway, jit_prolog and friends are fine for now, I'm just trying to
plan out my development strategy, so I'll look into this further at
some future point.

Sandro


Re: Code patching, context switching

by Sandro Magi-2

On Wed, Feb 20, 2008 at 12:19 PM, Paolo Bonzini <bonzini@...> wrote:

>
>  > That's unfortunate. Is there a good reason for this artificial
>  > limitation? Seems Lightning would be much more flexible as a code
>  > generator with a low-level interface, which is what I assumed
>  > JIT_R/V/FPR were, and a higher-level interface, which is what
>  > jit_getarg and jit_prolog are. I can understand wanting to insulate
>  > the users from inadvertently clobbering the argument registers, but a
>  > low-level api would still be nice.
>
>  The problem is that jit_pusharg needs to clobber r3-r9, and it would be
>  an unnecessary complication to say "this and that JIT_Rx register are
>  clobbered by jit_pusharg, but only on this and that architecture".

Yes, that's what I meant, which is why I suggested a nested API. The
ideal structure in my mind, is a full "Hardware Abstraction Layer",
and a simple template API built on it. The simple API consists of
using prolog, getarg, and pusharg, but not exposing any registers. The
user thus deals only with "variables", and this simple API uses a
naive linear register allocation expressible via macros. For instance,
on every pusharg, increment the last allocated register by one, then
wrap once you reach JIT_R_NUM, and push JIT_R(0) on the stack. Pusharg
can even accept an additional parameter as a hint (like C's 'register'
storage class), to help it determine whether to allocate it to a
register, or push it on the stack.

The low-level API then exposes the full register set for those who
actually want to generate their own code. I think this achieves the
optimal balance between ease of use, and flexibility. What do you
think?
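
One possible reading of the round-robin scheme described above, as a toy model in plain C. DEMO_R_NUM stands in for JIT_R_NUM, and struct linear_alloc and alloc_reg are hypothetical; no Lightning macros are used and nothing is actually emitted. Each request takes the next register index, and once the indices wrap, the register being reused is counted as spilled to the stack:

```c
#define DEMO_R_NUM 3   /* stand-in for JIT_R_NUM on some target */

struct linear_alloc {
    unsigned allocated;   /* total number of registers handed out so far */
    unsigned spills;      /* how many times an in-use register was reused */
};

/* Hand out the next register index in round-robin order. Once every
   register has been used, each further request reuses one and is
   counted as a spill of its previous contents. */
int alloc_reg(struct linear_alloc *a) {
    int reg = (int)(a->allocated % DEMO_R_NUM);
    if (a->allocated >= DEMO_R_NUM)
        a->spills++;
    a->allocated++;
    return reg;
}
```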

Sandro


Re: Code patching, context switching

by Paolo Bonzini-2


> For instance,
> on every pusharg, increment the last allocated register by one, then
> wrap once you reach JIT_R_NUM, and push JIT_R(0) on the stack.

Are we talking about the same pusharg? :-)

> Pusharg
> can even accept an additional parameter as a hint (like C's 'register'
> storage class), to help it determine whether to allocate it to a
> register, or push it on the stack.

Whether an argument goes in a register or on the stack is mandated by
the ABI.

> The low-level API then exposes the full register set for those who
> actually want to generate their own code.

There is already a kind of low-level ABI.  What I could do would be to
provide a JIT_ALL_REGS_NUM macro, a jit_all_regs[] array, and three
macros (e.g. JIT_IS_REG_CALLER_SAVE, JIT_IS_REG_CALLEE_SAVE,
JIT_IS_REG_RESERVED) to tell you whether a register is caller save,
callee save or reserved for use by lightning macros.  I'd prefer if you
wrote the patch though. :-)
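
To make the proposal concrete, here is a sketch of how such a table might look. None of this exists in Lightning: only the macro and array names are taken from the paragraph above, while the element type, the enum, the index-based macro arguments and the six-register toy machine are assumptions made for illustration:

```c
/* Hypothetical classification of a register. */
enum jit_reg_class {
    JIT_REG_CALLER_SAVE,
    JIT_REG_CALLEE_SAVE,
    JIT_REG_RESERVED
};

/* Hypothetical table entry: a name plus its class. */
struct jit_reg_desc {
    const char *name;
    enum jit_reg_class cls;
};

/* Toy six-register machine, not any real architecture. */
#define JIT_ALL_REGS_NUM 6
static const struct jit_reg_desc jit_all_regs[JIT_ALL_REGS_NUM] = {
    { "t0", JIT_REG_CALLER_SAVE },
    { "t1", JIT_REG_CALLER_SAVE },
    { "s0", JIT_REG_CALLEE_SAVE },
    { "s1", JIT_REG_CALLEE_SAVE },
    { "s2", JIT_REG_CALLEE_SAVE },
    { "sp", JIT_REG_RESERVED },
};

#define JIT_IS_REG_CALLER_SAVE(i) (jit_all_regs[(i)].cls == JIT_REG_CALLER_SAVE)
#define JIT_IS_REG_CALLEE_SAVE(i) (jit_all_regs[(i)].cls == JIT_REG_CALLEE_SAVE)
#define JIT_IS_REG_RESERVED(i)    (jit_all_regs[(i)].cls == JIT_REG_RESERVED)

/* Count the registers a context switch entered through a normal call
   would have to save itself, i.e. the callee-save ones. */
int count_callee_save(void) {
    int n = 0;
    for (int i = 0; i < JIT_ALL_REGS_NUM; i++)
        if (JIT_IS_REG_CALLEE_SAVE(i))
            n++;
    return n;
}
```

With a table like this, code that generates a context switch could iterate over all registers and pick the ones it must save, instead of hard-coding per-architecture knowledge.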

Paolo


Re: Code patching, context switching

by Sandro Magi-2

On Wed, Feb 20, 2008 at 11:55 AM, Paolo Bonzini <bonzini@...> wrote:
>
>  Yes.  r3-r9, which the ABI uses for argument passing, are not accessible
>  at all.  In the prolog, they are moved to other, callee-save registers,
>  which are also not accessible with JIT_{R,V} but only with jit_getarg.

Actually, it occurs to me that I don't need access to caller-save
registers to implement context switching. My context switch function
will simply be a function that I call using prolog, and all
caller-save registers will thus already be saved on the stack and
restored on return after switching stacks. I then simply need to save
the registers accessible via Lightning's JIT_V/FPR. Am I missing anything
here?

One potential pitfall is expensive floating-point state. Is there a
way to determine whether the floating point registers have been used
since the last context switch? I'd rather not have to save them if it
can be avoided. I'm currently reading the fenv.h header [1] which
discusses the portable C interface. By the way, Boost.Coroutine [2] has
done an impressive analysis of the issues involved.

Sandro

[1] http://www.opengroup.org/onlinepubs/009695399/basedefs/fenv.h.html
[2] http://www.crystalclearsoftware.com/soc/coroutine/index.html


Re: Code patching, context switching

by Paolo Bonzini-2

Sandro Magi wrote:

> On Wed, Feb 20, 2008 at 11:55 AM, Paolo Bonzini <bonzini@...> wrote:
>>  Yes.  r3-r9, which the ABI uses for argument passing, are not accessible
>>  at all.  In the prolog, they are moved to other, callee-save registers,
>>  which are also not accessible with JIT_{R,V} but only with jit_getarg.
>
> Actually, it occurs to me that I don't need access to caller-save
> registers to implement context switching. My context switch function
> will simply be a function that I call using prolog, and all
> caller-save registers will thus already be saved on the stack and
> restored on return after switching stacks. I then simply need to save
> the registers accessible via Lightning's JIT_V/FPR. Am I missing anything
> here?

No, I don't think so; what you say should work.

Paolo


