> On Thu, 14 Jun 2012 02:12:06 +1000
> Darren Reed <darrenr@...> wrote:
>>> That works on the receive path, however on the send path we still want
>>> in some scenarios to know, on the sender machine, when the packet was
>>> sent. sendmsg does not return information, so we must call recvmsg to
>>> get the ancillary data. The data is obtained by calling recvmsg on the
>>> error queue, where the copy of the sent packet, together with the
>>> ancillary data that tells us when the packet was sent, is waiting to
>>> be read.
>> And how do you tie something received with recvmsg() with something sent
>> via sendmsg()?
> I was wondering the same: should one then recvmsg(2) from the error
> queue every packet after sendmsg(2) to ensure synchronization? What
> happens if the application doesn't, will those accumulate in the "error
> queue", or is there room for a single packet there? Another common
> idiom would be getsockopt(2) to obtain status, but this has the same
I think they return the entire transmitted packet to the error queue, so
the contents of the packet are available to figure out which transmitted
packet the timestamp corresponds to. If the application is one of the more
likely users (i.e. NTP or PTP) this is straightforward to do, since there will
be one or more unique-to-the-packet timestamps in there which the application
will be preserving anyway. If the error queue also preserves the sendmsg()
order then drops can be detected by observing when packets which are expected
to be in the error queue are missing from the received sequence. I think the
queue can be maintained like any datagram socket receive buffer; if you don't
read it regularly it may fill to its limit, after which subsequent packets will
be dropped off the end of the queue. I think this works.
The reason I'm not fond of this, as I understand it, is that for Linux that
error queue was a preexisting mechanism already used for other things, so using
it for this as well was "free" in that it leveraged existing, common machinery.
In contrast, BSD kernels have managed to get by without this for other purposes,
so this would be a mechanism added to (all?) sockets on the off
chance that the application is going to want to timestamp outbound packets,
something which I suspect very few processes are going to be interested in.
That is, unless there is some sense that the "error queue" thing is going to be
useful for a wider variety of applications than just this. If timestamping
packets is the only application, it might be better to design a mechanism which
only needs to exist when the application declares that it is going to be
timestamping packets (perhaps using a setsockopt()/getsockopt() protocol). If
some sort of "transaction ID" were included with the original sendmsg() then it
wouldn't be necessary to return the entire transmitted packet to the
application; just the timestamp and the "transaction ID" would do it.
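To illustrate, the bookkeeping such a "transaction ID" scheme would leave to the application is small. This is a hypothetical sketch, not an existing API: the (id, timestamp) record shape and the in-transmit-order delivery guarantee are both assumptions.

```python
def match_completions(completions, first_id=1):
    """Hypothetical: the kernel returns small (transaction_id, timestamp)
    records in transmit order instead of whole packet copies.  Matching a
    timestamp back to its sendmsg() is then a dictionary lookup, and a
    gap in the IDs reveals a dropped record."""
    stamps = {}
    dropped = 0
    expected = first_id
    for txn_id, ts in completions:
        dropped += txn_id - expected   # 0 when nothing was lost
        expected = txn_id + 1
        stamps[txn_id] = ts
    return stamps, dropped
```

For example, receiving records for IDs 1, 2 and 5 tells the application that the timestamps for IDs 3 and 4 were lost, without any packet contents having to be returned.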
> It also makes the protocol synchronous, with two syscalls required per
> sent packet. I wonder if there's precedent on some other OSs for a
> sendmsg(2) variant which can accept a struct msghdr *? If so, would
> this be realistic using our current network+device stack to have the
> status filled before the syscall returns?
I guess the difficulty with this is that the underlying transmit timestamp
mechanism is actually asynchronous with respect to the sendmsg() syscall.
That is, the syscall to send a datagram is generally finished when the packet
is queued at the tail of the transmit queue of the output interface but the
timestamp is unavailable from the hardware until after the packet is output
to the wire. There is a variable, and possibly quite large, delay between
these two events, so returning the result from a single system call would
require sleeping to wait for completion where no sleep is done now.