[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

View: New views
19 Messages — Rating Filter:   Alert me  

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


URL:
  <http://savannah.nongnu.org/bugs/?27445>

                 Summary: PCB hangs in Fast Retransmit due to unchanging cwnd
                 Project: lwIP - A Lightweight TCP/IP stack
            Submitted by: smalloy
            Submitted on: Mon 14 Sep 2009 04:25:02 PM GMT
                Category: TCP
                Severity: 3 - Normal
              Item Group: Faulty Behaviour
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release:
            lwIP version: Other

    _______________________________________________________

Details:

I'm running 1.3.1, modified to simulate remote-end packet drops.  (There's a
1/1000 chance that I'll drop 3 packets in a row every time *if_output() is
called).

The symptoms:

After a short period of time, communication on the pcb would freeze.  The PCB
would show that TF_INFR was set, that the unacked queue was empty, and that
there were several segments present on the unsent queue.

tcp_output was being called on the PCB, but no traffic was going on the wire.
 This was because I was failing to enter the while loop around
tcp_do_output_nagle() because cwnd was set to 1*MSS.  The unsent queue was
never being drained, and we were dead in the water.

The problem:

RFC2581, Section 3.2, bullet point 3 says:

   3.  For each additional duplicate ACK received, increment
       cwnd by SMSS.  This artificially inflates the congestion
       window in order to reflect the additional segment that
       has left the network.

tcp_receive is structured so that cwnd will not inflate with every duplicate
ACK once a retransmit has been done (that is, once unacked is NULL, and
everything has been moved to the unsent queue with TF_INFR set).  If one of
the retransmitted packets is also lost and cwnd is too small for transmission,
cwnd will never grow to the point where lwip will attempt to re-retransmit the
packets on the unsent queue.

Proposed solution

Making the following change to tcp_receive() will cause cwnd to grow with
every duplicate ACK, regardless of whether or not a retransmit has already
been attempted.

I am unsure of the other implications of this change.

770,771c770,771
<         if (pcb->dupacks >= 3) {
<           if (!(pcb->flags & TF_INFR) && pcb->unacked != NULL) {
---
>         if (pcb->dupacks >= 3 && pcb->unacked != NULL) {
>           if (!(pcb->flags & TF_INFR)) {






    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Update of bug #27445 (project lwip):

             Assigned to:                    None => kieranm                
         Planned Release:                         => 1.4.0                  

    _______________________________________________________

Follow-up Comment #1:

Thank you for such a good bug report.

The fix looks fine to me.  I'll look it over in more detail and check this in
when I next get the chance.

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Update of bug #27445 (project lwip):

         Planned Release:                   1.4.0 => 1.3.2                  


    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #2, bug #27445 (project lwip):

Kieran, I just noticed these are the same lines of code I changed to fix bug
#27329 (dupacks by unidirectional data transmit).

Although my change didn't change the current behaviour without Sean's fix,
his fix doesn't apply cleanly to the code any more.

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Nachricht geschickt von/durch Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #3, bug #27445 (project lwip):

OK, combining the fix for this and bug #27329 was not as trivial as I hoped.
I've ended up with that bit of code reading as follows:


      if (pcb->snd_wl2 + pcb->snd_wnd == right_wnd_edge){
        ++pcb->dupacks;

        if (pcb->dupacks >= 3) {
          if (!(pcb->flags & TF_INFR) && pcb->unacked != NULL) {
            /* This is fast retransmit. Retransmit the first unacked segment.
*/
            LWIP_DEBUGF(TCP_FR_DEBUG, ("tcp_receive: dupacks %"U16_F"
(%"U32_F"), fast retransmit %"U32_F"n",
                                       (u16_t)pcb->dupacks, pcb->lastack,
                                       ntohl(pcb->unacked->tcphdr->seqno)));
            tcp_rexmit(pcb);
            /* Set ssthresh to max (FlightSize / 2, 2*SMSS) */
            /*pcb->ssthresh = LWIP_MAX((pcb->snd_max -
              pcb->lastack) / 2,
              2 * pcb->mss);*/
            /* Set ssthresh to half of the minimum of the current cwnd and
the advertised window */
            if (pcb->cwnd > pcb->snd_wnd)
              pcb->ssthresh = pcb->snd_wnd / 2;
            else
              pcb->ssthresh = pcb->cwnd / 2;
           
            /* The minimum value for ssthresh should be 2 MSS */
            if (pcb->ssthresh < 2*pcb->mss) {
              LWIP_DEBUGF(TCP_FR_DEBUG, ("tcp_receive: The minimum value for
ssthresh %"U16_F" should be min 2 mss %"U16_F"...n", pcb->ssthresh,
2*pcb->mss));
              pcb->ssthresh = 2*pcb->mss;
            }
           
            pcb->cwnd = pcb->ssthresh + 3 * pcb->mss;
            pcb->flags |= TF_INFR;
          } else {
            /* Inflate the congestion window, but not if it means that
               the value overflows. */
            if ((u16_t)(pcb->cwnd + pcb->mss) > pcb->cwnd) {
              pcb->cwnd += pcb->mss;
            }
          }
        }

        if (pcb->unacked == NULL && pcb->unsent == NULL)
          pcb->dupacks = 0;
      }


Doest that look OK?

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #4, bug #27445 (project lwip):

if ((u16_t)(pcb->cwnd + pcb->mss) > pcb->cwnd) { ... }

From a coding standpoint, the 2 variables are u16_t and the cast has no
effect.  From a runtime standpoint, is the cast always correct?  

If the addition could overflow, the test fails.  You should then cast to
u32_t.  But I don't think those values can ever cause an overflow.

Maybe I'm picky - when I see a cast I think "we're changing what the compiler
will normally do".  In this case we're not and at worse, could fail on an
overflow.

Bill

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #5, bug #27445 (project lwip):

Sorry, if you do cast to u32_t, it's:

if (((u32_t)pcb->cwnd + pcb->mss > pcb->cwnd) { ... }

Bill

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

Re: [bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Alain Mouette :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Bill Auerbach escreveu:

> Follow-up Comment #4, bug #27445 (project lwip):
>
> if ((u16_t)(pcb->cwnd + pcb->mss) > pcb->cwnd) { ... }
>
>>From a coding standpoint, the 2 variables are u16_t and the cast has no
> effect.  From a runtime standpoint, is the cast always correct?  
>
> If the addition could overflow, the test fails.  You should then cast to
> u32_t.  But I don't think those values can ever cause an overflow.
>
> Maybe I'm picky - when I see a cast I think "we're changing what the compiler
> will normally do".  In this case we're not and at worse, could fail on an
> overflow.

I allways cast to signed:
  if (( (int16_t)((pcb->cwnd + pcb->mss) - pcb->cwnd) ) >0) { ... }
allways with lots of parentesis just to make sure that no optimyser gets
in the way

This should allways work without using 32bits un-nessessarily

Alain


_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #6, bug #27445 (project lwip):

After looking at Alain's email, isn't:

if ((u16_t)(pcb->cwnd + pcb->mss) > pcb->cwnd) { ... }

the same as

if ((u16_t)(pcb->mss) > 0) { ... }

and isn't that always true?  If mss can be 0, then the second form is less
code.

Bill

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

RE: [bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Bill Auerbach :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>I allways cast to signed:
>  if (( (int16_t)((pcb->cwnd + pcb->mss) - pcb->cwnd) ) >0) { ... }
>allways with lots of parentesis just to make sure that no optimyser gets
>in the way
>
>This should allways work without using 32bits un-nessessarily

Not generically if 'pcb->cwnd + pcb->mss' overflows the maximum positive value a s16_t can hold.  Converting to signed supports half the range of the original expression.  Leaving it unsigned would allow it to work if cwnd was 10000 and mss was 50000.  But for this expression given, it is the same as this:

if ((u16_t)( pcb->mss) > 0) { ... }

Bill




_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #7, bug #27445 (project lwip):

>  From a runtime standpoint, is the cast always correct?
> If the addition could overflow, the test fails.

That's the point, as the comment above the line states.

The code looks OK to me.

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Nachricht geschickt von/durch Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #8, bug #27445 (project lwip):

Then you want:

if ((u16_t)(pcb->cwnd + pcb->mss) < pcb->cwnd) { ... }

If you add something to cwnd and it became less than the original cwnd, then
it overflowed, right?

The initial point that I was trying to show is that the cast is meaningless
and doesn't add anything.

Bill

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

Parent Message unknown Re: [bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by goldsimon@gmx.de :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Bill Auerbach wrote:

> Follow-up Comment #8, bug #27445 (project lwip):
>
> Then you want:
>
> if ((u16_t)(pcb->cwnd + pcb->mss) < pcb->cwnd) { ... }
>
> If you add something to cwnd and it became less than the original cwnd, then
> it overflowed, right?
>  
The code wants to do something if it *doesn't* overflow:

/* Inflate the congestion window, but not if it means that
   the value overflows. */
if ((u16_t)(pcb->cwnd + pcb->mss) > pcb->cwnd) {
  pcb->cwnd += pcb->mss;
}

> The initial point that I was trying to show is that the cast is meaningless
> and doesn't add anything.
>  
That's true, although that piece of code didn't get changed by what
Kieran suggested - it's been in there for some time. Still, the cast can
be removed, of course.

Simon




_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Update of bug #27445 (project lwip):

                  Status:                    None => Fixed                  
             Open/Closed:                    Open => Closed                

    _______________________________________________________

Follow-up Comment #9:

Fix checked in as described.

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #10, bug #27445 (project lwip):

Kieran, patch works fine.
But why increment of dupack and reset of dupack are in different places?
In this case we have small overhead:
 at first increment and than reset.

It could be better:
if (pcb->unacked == NULL && pcb->unsent == NULL) {
  pcb->dupacks = 0;
}
else {
  ++pcb->dupacks;
}


    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #11, bug #27445 (project lwip):

I didn't want to change the behavior of inflating the congestion window,
which is dependent on the value of dupacks.  I therefore opted to increase
dupacks, then do all the other things, then reset dupack if necessary.  If I
reset dupacks it at the start of that code block some of the other things
might not happen.

To be honest I'm still not sure the current code is right in this regard: we
only inflate the congestion window if dupacks > 3, and so will stop increasing
the congestion window once all the data have been acked and we reset dupacks
to zero, even though we are getting more acknowledgements telling us that
packets have left the network.

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #12, bug #27445 (project lwip):

OK, diff makes a bad job of displaying my changes, so here's what the code
block that deals with dupacks now looks like:


    /* (From Stevens TCP/IP Illustrated Vol II, p970.) Its only a
     * duplicate ack if:
     * 1) It doesn't ACK new data
     * 2) length of received packet is zero (i.e. no payload)
     * 3) the advertised window hasn't changed
     * 4) There is outstanding unacknowledged data (retransmission timer
running)
     * 5) The ACK is == biggest ACK sequence number so far seen (snd_una)
     *
     * If it passes all five, should process as a dupack:
     * a) dupacks < 3: do nothing
     * b) dupacks == 3: fast retransmit
     * c) dupacks > 3: increase cwnd
     *
     * If it only passes 1-3, should reset dupack counter (and add to
     * stats, which we don't do in lwIP)
     *
     * If it only passes 1, should reset dupack counter
     *
     */

    /* Clause 1 */
    if (TCP_SEQ_LEQ(ackno, pcb->lastack)) {
      pcb->acked = 0;
      /* Clause 2 */
      if (tcplen == 0) {
        /* Clause 3 */
        if (pcb->snd_wl2 + pcb->snd_wnd == right_wnd_edge){
          /* Clause 4 */
          if (pcb->rtime >= 0) {
            /* Clause 5 */
            if (pcb->lastack == ackno) {
              found_dupack = 1;
              if (pcb->dupacks + 1 > pcb->dupacks)
                ++pcb->dupacks;
              if (pcb->dupacks > 3) {
                /* Inflate the congestion window, but not if it means that
                   the value overflows. */
                if ((u16_t)(pcb->cwnd + pcb->mss) > pcb->cwnd) {
                  pcb->cwnd += pcb->mss;
                }
              } else if (pcb->dupacks == 3) {
                /* Do fast retransmit */
                tcp_rexmit_fast(pcb);
              }
            }
          }
        }
      }
      /* If Clause (1) or more is true, but not a duplicate ack, reset
       * count of consecutive duplicate acks */
      if (!found_dupack) {
        pcb->dupacks = 0;
      }
    } else if (TCP_SEQ_BETWEEN(ackno, pcb->lastack+1, pcb->snd_nxt)){
      /* We come here when the ACK acknowledges new data. */




    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Sylvain Beucler-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #13, bug #27445 (project lwip):

Stevens - I forgot about this Bible-book!!!

    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?27445>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel

Re: [bug #27445] PCB hangs in Fast Retransmit due to unchanging cwnd

by Kieran Mansley :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, 2009-10-30 at 12:00 +0000, Oleg Tyshev wrote:
> Follow-up Comment #13, bug #27445 (project lwip):
>
> Stevens - I forgot about this Bible-book!!!

[replying to lwip-devel as this isn't relevant to the bug]

Stevens' books are now available and searchable via google books.

Kieran



_______________________________________________
lwip-devel mailing list
lwip-devel@...
http://lists.nongnu.org/mailman/listinfo/lwip-devel