[PATCH] fix improper return of local sender TID in propagated IPC

View: New views
3 Messages — Rating Filter:   Alert me  

[PATCH] fix improper return of local sender TID in propagated IPC

by Kalle A. Sandstrom :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


The attached patch[1] fixes a problem where a recipient may get the thread
ID of a propagating thread rather than the propagator's VirtualSender TCR.

This misbehaviour was observed when the propagator and the recipient were in
one address space and the original sender was in another, and the propagator
went through a blocking send phase to deliver the message. The issue was
that at the end of the last function in kernel/src/api/v4/ipc.cc,
from_tcb->get_global_id() was used interchangeably with
current->get_partner() when in the propagating case they almost certainly
refer to distinct threads.

As for why this was only observed in a blocking-send case and not in the
non-blocking case, I have no idea. I only drilled down from the return value
observed by the recipient, and correcting it seemed the right thing to do.


The changeset also enforces, using an ASSERT(), the notion that the virtual
sender in IPC propagation is always identified by global TID. This is
supported by lines 313-329 in kernel/src/api/v4/ipc.cc, which silently
disables[2] propagation if VirtualSender is not a global TID (as a local TID
cannot be equal to the result of get_global_id() of the virtual sender's
tcb_t).

As requiring that the VirtualSender TCR always contain a global TID when
propagation is specified in the message tag, I'd like to recommend that the
L4.X2 spec be revised accordingly.


If it matters for a short patch like this, the attached changeset is
licensed under the two-clause BSD license as given at the start of the
source file it modifies.


[1] hg bundle -- I hope this is the correct format

[2] IMO, silently disabling propagation is the wrong thing to do as it tells
    the sender that the IPC operation succeeded as specified when it did
    not. This behaviour also diverges from that of Pistachio 0.4 .

--
Kalle A. Sandstro"m                                        ksandstr@...
746B 4B14:              BFB5 6D3B 0758 CFBE 11F9  DF41 4C28 67FB 746B 4B14
void *truth = &truth;                              http://ksandstr.iki.fi/


pistachio-propagate-partner-properly.hg (1K) Download Attachment

RE: [PATCH] fix improper return of local sender TID in propagated IPC

by Jan Stoess :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hello Kalle,

Thanks for the patch. It makes sense to me, and I have committed it to the repository. Apparently the local ID logic doesn't handle the propagation case very well.

> As requiring that the VirtualSender TCR always contain a global TID
> when propagation is specified in the message tag, I'd like to recommend
> that the
> L4.X2 spec be revised accordingly.

Hmm. I actually wouldn't want change the spec. The implementation follows the spec, not vice versa. And just because the implementation doesn't handle every (corner) case or contains bugs, I don't see any conceptual reason why local IDs shouldn't be used when propagating messages.

> If it matters for a short patch like this, the attached changeset is
> licensed under the two-clause BSD license as given at the start of the
> source file it modifies.

Yes, that actually matters. BSD license is fine, though. Thanks.

> [2] IMO, silently disabling propagation is the wrong thing to do as it
> tells
>     the sender that the IPC operation succeeded as specified when it
> did
>     not. This behaviour also diverges from that of Pistachio 0.4 .

That's debatable. The spec says: "If originator thread and current sender, or
current sender and receiver reside in the same address space, propagation is always permitted.
Otherwise, IPC occurs unpropagated.". The implementation actually implements that behavior, and it seems to be an unwanted side-effect that local IDs don't work there and that the implementation then continues with the IPC, but unpropagated. An ASSERT catching local IDs might be more appropriate in this case.

Best,
-Jan

--
Jan Stoess
System Architecture Group
University of Karlsruhe
Phone: +49 (721) 608-4056
Fax: +49 (721) 608-7664
eMail: stoess@...



Re: [PATCH] fix improper return of local sender TID in propagated IPC

by Kalle A. Sandstrom :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Apr 08, 2009 at 02:24:24PM +0200, Jan Stoess wrote:

> Thanks for the patch. It makes sense to me, and I have committed it to the
> repository. Apparently the local ID logic doesn't handle the propagation
> case very well.
>

No doubt there could be a more elegant solution as well, but that one seems
to do the locally right thing, for what it's worth.

> > As requiring that the VirtualSender TCR always contain a global TID when
> > propagation is specified in the message tag, I'd like to recommend that
> > the L4.X2 spec be revised accordingly.
>
> Hmm. I actually wouldn't want change the spec. The implementation follows
> the spec, not vice versa. And just because the implementation doesn't
> handle every (corner) case or contains bugs, I don't see any conceptual
> reason why local IDs shouldn't be used when propagating messages.
>

That sounds reasonable too, i.e. the kernel can convert a local ID in
VirtualSender to the appropriate ID for returning to the IPC recipient where
necessary -- entirely regardless of which format the ID given by the IPC
sender was in.

It's good to know this was just an implementation artifact.

> > [2] IMO, silently disabling propagation is the wrong thing to do as it
> > tells the sender that the IPC operation succeeded as specified when it
> > did not. This behaviour also diverges from that of Pistachio 0.4 .
>
> That's debatable. The spec says: "If originator thread and current sender,
> or current sender and receiver reside in the same address space,
> propagation is always permitted. Otherwise, IPC occurs unpropagated.". The
> implementation actually implements that behavior, and it seems to be an
> unwanted side-effect that local IDs don't work there and that the
> implementation then continues with the IPC, but unpropagated.

I should've said "... that the IPC operation succeeded as expected by the
caller when it did not", i.e. my critique was toward the spec rather than
the implementation.

It would be interesting to learn the rationale for this behaviour. From a
naive reading of the way IPC is specified in L4.X2, propagation feels like a
hack (i.e. the designers having added a new behaviour to the IPC syscall and
a new TCR, but keeping the existing set of error conditions so as to re-use
existing code without modification) -- but that sounds implausible given how
it fits with clans-and-chiefs, which (IIRC) has been around for quite a
while.

> An ASSERT catching local IDs might be more appropriate in this case.
>

Agreed. Even if non-propagation with "p" set were reported as an error
status, there don't seem to be any obvious things that the sender could do
to recover besides logging the unexpected condition and general attempts to
return to a known-good state.

On the other hand, delivering a "not quite propagated" message to the
recipient may cause all sorts of confusion in operations where the sender's
ID is significant. On the third hand, a case can be made that regardless of
whether delivery occurs, the system as a whole is already behaving in an
undefined manner and that lossage of some kind is bound to occur. (Side
effects of such a misdelivery would presumably be limited to things that are
identified by the ID of the propagator, and so wouldn't be devastating by
themselves.)

--
Kalle A. Sandstro"m                                        ksandstr@...
746B 4B14:              BFB5 6D3B 0758 CFBE 11F9  DF41 4C28 67FB 746B 4B14
void *truth = &truth;                              http://ksandstr.iki.fi/