netconn/netbuf api with receiving timeout missing closed connection


netconn/netbuf api with receiving timeout missing closed connection

by Dmitri Snejko

Hello,

I am trying to use the netconn API with a receive timeout other than 0. I
am using lwIP 1.3.0 with FreeRTOS 5.4.2/ColdFire and have applied the patch
for 1.3.1:

http://cvs.savannah.gnu.org/viewvc/lwip/src/include/lwip/err.h?root=lwip&r1=1.13&r2=1.14

My application is a server that listens for incoming connections, opens a
new netconn for each client (telnet-like) and allocates a static buffer
from a pool to receive the stream. With the timeout set, receiving no
longer blocks, but I have a problem when the connection is closed on the
remote side. My first idea was to read the netconn err field when
netconn_recv returns NULL and decide from the error code whether the
remote side closed the connection or a fatal error happened; either would
be a reason to close the connection on the server side. This works fine
for some time, but after a few hundred open/close cycles the server stops
noticing that the other side has closed its end. I assume netconn_recv
returns ERR_TIMEOUT, which is not fatal. As a result the buffer stays
allocated and I run out of resources.
The simplified code looks like this:

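listener->recv_timeout = 1;   /* short timeout so netconn_accept() does not block */
for (;;) {
    new_conn = netconn_accept(listener);
    if (new_conn != NULL) {
        if ((cb = find_free_cb()) == NULL) {
            /* no free control block: refuse the client */
            netconn_close(new_conn);
            netconn_delete(new_conn);
        } else {
            new_conn->recv_timeout = 1;
            cb->conn = new_conn;
        }
    }
    /* poll every active connection once per pass */
    for (i = 0; i < MAX_CB; i++) {
        if (cb_pool[i].conn == NULL)
            continue;
        /* parentheses matter: assign first, then compare with NULL */
        if ((netbuff = netconn_recv(cb_pool[i].conn)) != NULL) {
            read(netbuff, cb_pool[i]);
        } else if (ERR_IS_FATAL(cb_pool[i].conn->err)) {
            /* fatal error or remote close: release the connection */
            netconn_close(cb_pool[i].conn);
            netconn_delete(cb_pool[i].conn);
            cb_pool[i].conn = NULL;
        }
    }
}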

Is this the right way to work with netconns, or should I do something else?

Regards, Dmitri.

_______________________________________________
lwip-users mailing list
lwip-users@...
http://lists.nongnu.org/mailman/listinfo/lwip-users

Re: netconn/netbuf api with receiving timeout missing closed connection

by Kieran Mansley

On Sat, 2009-10-24 at 00:29 -0400, Dmitri Snejko wrote:

> Hello,
>
> I am trying to use the netconn API with a receive timeout other than 0. I
> am using lwIP 1.3.0 with FreeRTOS 5.4.2/ColdFire and have applied the
> patch for 1.3.1:
>
> http://cvs.savannah.gnu.org/viewvc/lwip/src/include/lwip/err.h?root=lwip&r1=1.13&r2=1.14
>
> My application is a server that listens for incoming connections, opens a
> new netconn for each client (telnet-like) and allocates a static buffer
> from a pool to receive the stream. With the timeout set, receiving no
> longer blocks, but I have a problem when the connection is closed on the
> remote side. My first idea was to read the netconn err field when
> netconn_recv returns NULL and decide from the error code whether the
> remote side closed the connection or a fatal error happened; either would
> be a reason to close the connection on the server side. This works fine
> for some time, but after a few hundred open/close cycles the server stops
> noticing that the other side has closed its end. I assume netconn_recv
> returns ERR_TIMEOUT, which is not fatal. As a result the buffer stays
> allocated and I run out of resources.

There was a bug reported recently about the way that netconn_recv deals
with the netconn->err field:

http://savannah.nongnu.org/bugs/?27709

I hope that we will change the API for netconn_recv to return an error
like all the other netconn functions, rather than rely on the conn->err
field, but in the meantime you might like to try the partial fix
mentioned in that bug report.
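
As a rough illustration of why a returned error code is easier to act on
than a shared conn->err field, here is a minimal self-contained sketch. It
does not use lwIP itself: every demo_* name is hypothetical, and the enum
values are placeholders that are not copied from any lwIP err.h. The point
is only that the caller gets the buffer and the error from the same call and
can separate "timeout, keep polling" from "connection is gone".

```c
#include <stddef.h>

/* Placeholder error codes for illustration only. The names echo lwIP's
 * err.h, but the numeric values are NOT taken from any lwIP release. */
typedef enum {
    DEMO_ERR_OK = 0,
    DEMO_ERR_TIMEOUT,   /* nothing arrived within the receive timeout */
    DEMO_ERR_RST,       /* peer reset the connection */
    DEMO_ERR_CLSD       /* peer closed the connection */
} demo_err_t;

/* What the polling loop should do with a connection after one receive. */
typedef enum {
    DEMO_KEEP_POLLING,
    DEMO_CONSUME_DATA,
    DEMO_RELEASE_CONN
} demo_action_t;

/* Hypothetical receive result: the error travels with the buffer instead
 * of being read later from a field that another thread may overwrite. */
typedef struct {
    demo_err_t  err;
    const void *buf;    /* non-NULL only when err == DEMO_ERR_OK */
} demo_recv_result_t;

demo_action_t demo_classify(demo_recv_result_t r)
{
    if (r.err == DEMO_ERR_OK && r.buf != NULL)
        return DEMO_CONSUME_DATA;
    if (r.err == DEMO_ERR_TIMEOUT)
        return DEMO_KEEP_POLLING;   /* not fatal: try again next pass */
    /* Anything else, including "OK" without data, is treated as the end
     * of the connection in this sketch, so the buffer can be freed. */
    return DEMO_RELEASE_CONN;
}
```

In a loop like the one in the original post, DEMO_RELEASE_CONN would be the
branch that returns the control block to the pool.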

Kieran



_______________________________________________
lwip-users mailing list
lwip-users@...
http://lists.nongnu.org/mailman/listinfo/lwip-users

Re: netconn/netbuf api with receiving timeout missing closed connection

by Dmitri Snejko



> There was a bug reported recently about the way that netconn_recv deals
> with the netconn->err field:

>http://savannah.nongnu.org/bugs/?27709

> I hope that we will change the API for netconn_recv to return an error
> like all the other netconn functions, rather than rely on the conn->err
> field, but in the mean time you might like to try the partial fix
> mentioned in that bug report.

> Kieran

I tried 1.3.1 and the latest api_lib.c from CVS with the fix for netconn_recv. It is still losing close conditions.
I think there is something more than just a race condition on netconn->err. I found that with a timeout of 0
a close on the remote side is caught fine, but if I close the client side abruptly in the middle of a transfer, lwIP does not
recover and close the connection after a timeout. It keeps it open.
The socket API, which is also built on netconn, does well in all situations.

Dmitri.




Re: netconn/netbuf api with receiving timeout missing closed connection

by Kieran Mansley

On Fri, 2009-10-30 at 23:27 -0400, Dmitri Snejko wrote:
> I tried 1.3.1 and the latest api_lib.c from CVS with the fix for
> netconn_recv. It is still losing close conditions.
> I think there is something more than just a race condition on
> netconn->err. I found that with a timeout of 0 a close on the remote
> side is caught fine, but if I close the client side abruptly in the
> middle of a transfer, lwIP does not recover and close the connection
> after a timeout. It keeps it open.
> The socket API, which is also built on netconn, does well in all
> situations.

Can you get a packet capture to show the case where lwIP fails to spot
the closed connection?

Thanks

Kieran




Re: netconn/netbuf api with receiving timeout missing closed connection

by Dmitri Snejko

Hello Kieran,

I have found why the closed connection was missed with the receive
timeout set to 0. It was a problem in my application. The remote
application was closed and the Windows stack sent an RST. netconn_recv
set err to ERR_RST. My application then tried to close the netconn by
calling netconn_close followed by netconn_delete. netconn_close never
returned, freezing the receiving task. After I removed netconn_close and
left only netconn_delete, ERR_RST and ERR_CLSD were never missed again.
For a timeout > 0 lwIP is still missing closed connections, but that
could be the race you mentioned. I found that the whole idea doesn't work
well, and the application should follow the sockets select() approach to
be truly nonblocking on multiple netconns.
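
For readers wondering what "the select way" looks like in outline: instead
of polling every connection with a short timeout, the task keeps a
readiness count per connection that an event callback updates, and only
calls receive on connections that are marked ready. The sketch below is a
plain C model of that bookkeeping, not lwIP code. All demo_* names are
hypothetical; lwIP's own counterpart would be a netconn created with an
event callback, and a real version would need locking (and a semaphore to
wake the task), because the callback runs in another thread.

```c
#include <stddef.h>

#define DEMO_MAX_CONN 4

/* Simplified stand-ins for "data arrived" / "data consumed" events. */
typedef enum { DEMO_EVT_RCVPLUS, DEMO_EVT_RCVMINUS } demo_evt_t;

typedef struct {
    int in_use;        /* slot holds an open connection */
    int rcv_pending;   /* number of unread receive events */
} demo_conn_t;

static demo_conn_t demo_conns[DEMO_MAX_CONN];

/* Event hook: called whenever the stack has news for connection idx.
 * Events for unknown or unused slots are ignored. */
void demo_event(int idx, demo_evt_t evt)
{
    if (idx < 0 || idx >= DEMO_MAX_CONN || !demo_conns[idx].in_use)
        return;
    if (evt == DEMO_EVT_RCVPLUS)
        demo_conns[idx].rcv_pending++;
    else if (demo_conns[idx].rcv_pending > 0)
        demo_conns[idx].rcv_pending--;
}

/* Returns the index of the first ready connection at or after start,
 * or -1 when nothing is ready and the task may sleep. */
int demo_next_ready(int start)
{
    for (int i = (start < 0 ? 0 : start); i < DEMO_MAX_CONN; i++) {
        if (demo_conns[i].in_use && demo_conns[i].rcv_pending > 0)
            return i;
    }
    return -1;
}
```

With this shape the receive call is only made when something is known to be
waiting, so a NULL result no longer has to be distinguished from a timeout.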
Thanks
Dmitri.




Re: netconn/netbuf api with receiving timeout missing closed connection

by Kieran Mansley

On Tue, 2009-11-03 at 23:39 -0500, Dmitri Snejko wrote:

> Hello Kieran,
>
> I have found why the closed connection was missed with the receive
> timeout set to 0. It was a problem in my application. The remote
> application was closed and the Windows stack sent an RST. netconn_recv
> set err to ERR_RST. My application then tried to close the netconn by
> calling netconn_close followed by netconn_delete. netconn_close never
> returned, freezing the receiving task.

That sounds like a bug.  Could you file a bug report on savannah?

> After I removed netconn_close and left only netconn_delete, ERR_RST and
> ERR_CLSD were never missed again.
> For a timeout > 0 lwIP is still missing closed connections, but that
> could be the race you mentioned. I found that the whole idea doesn't work
> well, and the application should follow the sockets select() approach to
> be truly nonblocking on multiple netconns.

That is the easiest way.

Kieran




Re: netconn/netbuf api with receiving timeout missing closed connection

by Dmitri Snejko


Kieran Mansley wrote:

> That sounds like a bug.  Could you file a bug report on savannah?

I traced it down. This lockup happens only with LWIP_TCPIP_CORE_LOCKING.
When the netconn err is ERR_RST, netconn->pcb.tcp is set to NULL.
The callback do_close:
void
do_close(struct api_msg_msg *msg)
{
#if LWIP_TCP
  if ((msg->conn->pcb.tcp != NULL) && (msg->conn->type == NETCONN_TCP)) {
      msg->conn->state = NETCONN_CLOSE;
      do_close_internal(msg->conn);
      /* for tcp netconns, do_close_internal ACKs the message */
  } else
#endif /* LWIP_TCP */
  {
    msg->conn->err = ERR_VAL;
    TCPIP_APIMSG_ACK(msg);
  }
}

is in this case supposed to call the macro TCPIP_APIMSG_ACK(msg), which is
defined as empty for LWIP_TCPIP_CORE_LOCKING. As a result the application
layer blocks waiting on the conn->op_completed semaphore.
netconn_delete, by contrast, calls do_delconn, which in the same situation does:

  if (msg->conn->op_completed != SYS_SEM_NULL) {
    sys_sem_signal(msg->conn->op_completed);
  }

and survives. Should we call it a bug? LWIP_TCPIP_CORE_LOCKING is
experimental.
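
To make the difference between the two paths concrete, here is a small
stand-alone model of it. This is not lwIP code: the demo_* names are
hypothetical, the semaphore is reduced to a counter, and the "ack" is passed
in as a function so both behaviours (signalling and empty) can be compared
in one program. It only reproduces the logic described above: with an empty
ack and no pcb, nothing ever signals the waiter.

```c
#include <stddef.h>

#define DEMO_ERR_VAL (-1)   /* placeholder value, not lwIP's ERR_VAL */

typedef struct {
    int has_pcb;        /* 0 models pcb.tcp == NULL after a reset */
    int err;
    int op_completed;   /* counter standing in for the semaphore */
} demo_close_conn_t;

typedef void (*demo_ack_fn)(demo_close_conn_t *);

/* Ack that signals the waiter. */
void demo_ack_signal(demo_close_conn_t *c) { c->op_completed++; }

/* Ack that expands to nothing, as described for core locking. */
void demo_ack_noop(demo_close_conn_t *c) { (void)c; }

/* Model of the close handler: the error branch relies on the ack. */
void demo_do_close(demo_close_conn_t *c, demo_ack_fn ack)
{
    if (c->has_pcb) {
        /* normal path: the internal close routine signals completion */
        c->op_completed++;
    } else {
        c->err = DEMO_ERR_VAL;
        ack(c);
    }
}

/* Model of the delete handler: signals directly, independent of the ack. */
void demo_do_delconn(demo_close_conn_t *c)
{
    c->op_completed++;
}

/* Returns 1 if the caller's wait would return, 0 if it would block. */
int demo_wait_would_return(demo_close_conn_t *c)
{
    if (c->op_completed > 0) {
        c->op_completed--;
        return 1;
    }
    return 0;
}
```

Calling demo_do_close with demo_ack_noop on a connection without a pcb
leaves the counter at zero, which corresponds to the frozen receiving task;
demo_do_delconn on the same connection lets the wait complete.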

Regards,
Dmitri.



Re: netconn/netbuf api with receiving timeout missing closed connection

by Simon Goldschmidt

> Should we call  it  a bug?

Yes, although not a severe one (since CORE_LOCKING is experimental).

> LWIP_TCPIP_CORE_LOCKING is experimental.

But it can still be enabled by just a define. And once a bug is filed, others can see it, too, if they have the same problem.

Could you file a bug for it?

Simon



Re: netconn/netbuf api with receiving timeout missing closed connection

by Dmitri Snejko

Simon Goldschmidt wrote:

> Could you file a bug for it?

Sure.
https://savannah.nongnu.org/bugs/index.php?27955
Regards,
Dmitri.

