[PATCH] Timeout support - Please review

View: New views
9 Messages — Rating Filter:   Alert me  

[PATCH] Timeout support - Please review

by colding :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

Any and all occurrences of g_cond_wait() has the potential to block
forever if the remote server is physically disconnected or powered of.

The blocking behavior can be demonstrated (with pseudo code) like this:

1) Get a server object reference:

   CORBA_ORB orb = get_orb();
   char *objref = get_corbaloc_str();
   CORBA_Object obj = (CORBA_Object)CORBA_ORB_string_to_object(orb, objref, ev);


2) Invoke some method on the server object:

   CORBA_Environment ev[1];
   CORBA_exception_init(ev);
   FOO_INTERFACE_method(obj, <arguments>, ev);


The above operation will block forever in giop_recv_buffer_get() if the
remote server has been physically powered down. It immediately throws an
exception if the server object is merely dead or non-existent.

As a rule of thumb - Blocking forever is generally bad performance-wise
for the client application...


The appended patch therefore implements support for the ex_CORBA_TIMEOUT
system exception by replacing most occurrences of g_cond_wait() with
g_cond_timed_wait(). The only remaining instance of g_cond_wait() is
found in link_exec_command(). I haven't observed any problems with
link_exec_command() so I've not touched that code.

The affected code is:

1) New ORB option: GIOPTimeoutLimit - timeout limit in microseconds.
Defaults to 30 seconds.

2) giop_recv_buffer_get(). This place is obvious. We do not want to wait
indefinitely for data to be received from a downed host. The waiting
interval is configurable with the new GIOPTimeoutLimit ORB option.

3) Any code invoking link_wait():

   * link_connection_wait_connected_T()
   * link_connection_try_reconnect()
   * giop_connection_try_reconnect(), calls link_connection_try_reconnect()
   * ORBit_try_connection_T(), calls giop_connection_try_reconnect()
   * ORBit_object_get_connection(), calls ORBit_try_connection_T()

4) The waiting time in link_wait() is set at compile time to
LINK_WAIT_TIMEOUT_USEC which is presently 10 seconds.

5) link_wait() has been modified to return a gboolean. FALSE if the
timeout expired, TRUE otherwise.

6) link_connection_wait_connected_T() and
link_connection_try_reconnect() will both disconnect the connection if
link_wait() experienced a timeout (returned FALSE).

Please review and test this patch. I'm not entirely convinced that I
haven't introduced a bad bug or two but at least it works(*) here.


Thanks,
  jules

(*) Tested with evolution-brutus on a dual Opteron Gentoo.


Index: ChangeLog
===================================================================
RCS file: /cvs/gnome/ORBit2/ChangeLog,v
retrieving revision 1.778
diff -u -p -r1.778 ChangeLog
--- ChangeLog 22 Nov 2006 20:34:39 -0000 1.778
+++ ChangeLog 4 Dec 2006 13:20:48 -0000
@@ -1,3 +1,25 @@
+2006-12-04  Jules Colding  <colding@...>
+
+ * configure.in: Added autoconf 2.60+ required datarootdir variable
+
+2006-12-03  Jules Colding  <colding@...>
+
+ * include/orbit/GIOP/giop.h: Declare giop_recv_set_timeout().
+
+ * src/orb/orb-core/corba-orb.c (CORBA_ORB_init): Set GIOP timeout
+ limit from the ORB option.
+ Added new ORB option - GIOPTimeoutLimit.
+
+ * src/orb/orb-core/orbit-small.c (ORBit_small_invoke_stub): Throw a
+ TIMEOUT exception if a reply hasn't been recieved within the GIOP
+ timeout limit.
+
+ * src/orb/GIOP/giop-recv-buffer.c (giop_recv_buffer_get): Replaced
+ a g_cond_wait() with a g_cond_timed_wait(). The original g_cond_wait()
+ could potentially block forever if a remote server was physically
+ offline.
+ (giop_recv_set_timeout): Function to adjust the GIOP timeout limit.
+
 2006-11-22  Kjartan Maraas  <kmaraas@...>
 
  * ORBit-2.0.pc.in: Move MINGW_CFLAGS and LIBM to Libs.private
Index: configure.in
===================================================================
RCS file: /cvs/gnome/ORBit2/configure.in,v
retrieving revision 1.175
diff -u -p -r1.175 configure.in
--- configure.in 22 Nov 2006 20:34:41 -0000 1.175
+++ configure.in 4 Dec 2006 13:20:48 -0000
@@ -37,6 +37,9 @@ AM_CONFIG_HEADER(config.h)
 dnl Initialize automake stuff
 AM_INIT_AUTOMAKE(ORBit2, $ORBIT_VERSION, no-define)
 
+dnl Required by autoconf 2.60
+AC_SUBST(datarootdir)
+
 AC_CANONICAL_HOST
 AC_MSG_CHECKING([for Win32])
 case "$host" in
Index: include/orbit/GIOP/giop-recv-buffer.h
===================================================================
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop-recv-buffer.h,v
retrieving revision 1.33
diff -u -p -r1.33 giop-recv-buffer.h
--- include/orbit/GIOP/giop-recv-buffer.h 14 Jan 2004 11:04:01 -0000 1.33
+++ include/orbit/GIOP/giop-recv-buffer.h 4 Dec 2006 13:20:48 -0000
@@ -58,7 +58,8 @@ void            giop_recv_list_setup_que
 void            giop_recv_list_setup_queue_entry_async (GIOPMessageQueueEntry *ent,
  GIOPAsyncCallback      cb);
 
-GIOPRecvBuffer *giop_recv_buffer_get               (GIOPMessageQueueEntry *ent);
+GIOPRecvBuffer *giop_recv_buffer_get               (GIOPMessageQueueEntry *ent,
+    gboolean *timeout);
 void            giop_recv_buffer_unuse             (GIOPRecvBuffer        *buf);
 
 #define giop_recv_buffer_reply_status(buf) (                                            \
Index: include/orbit/GIOP/giop-types.h
===================================================================
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop-types.h,v
retrieving revision 1.21
diff -u -p -r1.21 giop-types.h
--- include/orbit/GIOP/giop-types.h 27 Oct 2003 16:14:12 -0000 1.21
+++ include/orbit/GIOP/giop-types.h 4 Dec 2006 13:20:48 -0000
@@ -35,7 +35,9 @@ struct _GIOPThread {
  gpointer dummy);
 };
 
-#define GIOP_INITIAL_MSG_SIZE_LIMIT 256*1024
+#define GIOP_INITIAL_TIMEOUT_LIMIT (30000000) /* 30 seconds */
+
+#define GIOP_INITIAL_MSG_SIZE_LIMIT (256*1024) /* in bytes */
 
 typedef enum {
  GIOP_CONNECTION_SSL
Index: include/orbit/GIOP/giop.h
===================================================================
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop.h,v
retrieving revision 1.24
diff -u -p -r1.24 giop.h
--- include/orbit/GIOP/giop.h 5 Apr 2006 07:16:10 -0000 1.24
+++ include/orbit/GIOP/giop.h 4 Dec 2006 13:20:48 -0000
@@ -23,6 +23,7 @@ gboolean    giop_thread_safe       (void
 gboolean    giop_thread_io         (void);
 GIOPThread *giop_thread_self       (void);
 void        giop_invoke_async      (GIOPMessageQueueEntry *ent);
+void        giop_recv_set_timeout  (const glong timeout);
 void        giop_recv_set_limit    (glong limit);
 glong       giop_recv_get_limit    (void);
 void        giop_incoming_signal_T (GIOPThread *tdata, GIOPMsgType t);
@@ -45,6 +46,7 @@ void        giop_thread_new_check      
 void        giop_thread_queue_process    (GIOPThread *tdata);
 gboolean    giop_thread_queue_empty_T    (GIOPThread *tdata);
 void        giop_thread_queue_tail_wakeup(GIOPThread *tdata);
+
 
 #endif /* ORBIT2_INTERNAL_API */
 
Index: linc2/ChangeLog
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/ChangeLog,v
retrieving revision 1.262
diff -u -p -r1.262 ChangeLog
--- linc2/ChangeLog 22 Nov 2006 19:13:41 -0000 1.262
+++ linc2/ChangeLog 4 Dec 2006 13:20:49 -0000
@@ -1,3 +1,8 @@
+2006-12-04  Jules Colding  <colding@...>
+
+ * src/linc.c (link_wait): Do not wait forever for the link
+ condition to get signaled.
+
 2006-11-22  Kjartan Maraas  <kmaraas@...>
 
  * src/Makefile.am: Remove SSL_LIBS and SSL_CFLAGS.
Index: linc2/include/linc/linc.h
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/include/linc/linc.h,v
retrieving revision 1.17
diff -u -p -r1.17 linc.h
--- linc2/include/linc/linc.h 28 Jan 2005 12:34:57 -0000 1.17
+++ linc2/include/linc/linc.h 4 Dec 2006 13:20:49 -0000
@@ -33,7 +33,7 @@ GMainLoop *link_main_get_loop    (void);
 guint      link_main_idle_add    (GSourceFunc function,
   gpointer    data);
 
-void       link_wait             (void);
+gboolean   link_wait             (void);
 void       link_signal           (void);
 
 gboolean   link_thread_io        (void);
Index: linc2/src/linc-connection.c
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/src/linc-connection.c,v
retrieving revision 1.117
diff -u -p -r1.117 linc-connection.c
--- linc2/src/linc-connection.c 9 Sep 2006 07:54:37 -0000 1.117
+++ linc2/src/linc-connection.c 4 Dec 2006 13:20:49 -0000
@@ -635,8 +635,10 @@ link_connection_do_initiate (LinkConnect
 static LinkConnectionStatus
 link_connection_wait_connected_T (LinkConnection *cnx)
 {
- while (cnx && cnx->status == LINK_CONNECTING)
- link_wait ();
+ while (cnx && cnx->status == LINK_CONNECTING) {
+ if (!link_wait ())
+ link_connection_disconnect (cnx);
+ }
 
  return cnx ? cnx->status : LINK_DISCONNECTED;
 }
@@ -659,8 +661,12 @@ link_connection_try_reconnect (LinkConne
  cnx->inhibit_reconnect = FALSE;
  dispatch_callbacks_drop_lock (cnx);
  g_main_context_release (NULL);
- } else
- link_wait ();
+ } else {
+ if (!link_wait ()) {
+ link_connection_disconnect (cnx);
+ break;
+ }
+ }
  }
 
  if (cnx->status != LINK_DISCONNECTED)
Index: linc2/src/linc.c
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/src/linc.c,v
retrieving revision 1.63
diff -u -p -r1.63 linc.c
--- linc2/src/linc.c 2 Jun 2006 08:14:22 -0000 1.63
+++ linc2/src/linc.c 4 Dec 2006 13:20:49 -0000
@@ -52,6 +52,9 @@ SSL_METHOD *link_ssl_method;
 SSL_CTX    *link_ssl_ctx;
 #endif
 
+/* max time to wait for the link condition to get signaled - 10 seconds */
+#define LINK_WAIT_TIMEOUT_USEC (10000000)
+
 static void link_dispatch_command (gpointer data, gboolean immediate);
 
 gboolean
@@ -534,17 +537,28 @@ link_signal (void)
  }
 }
 
-void
+gboolean
 link_wait (void)
 {
+ GTimeVal gtime;
+
  if (!(link_is_thread_safe && link_is_io_in_thread)) {
  link_unlock ();
  link_main_iteration (TRUE);
  link_lock ();
  } else {
  g_assert (link_main_cond != NULL);
- g_cond_wait (link_main_cond, link_main_lock);
+
+ g_get_current_time (>ime);
+ g_time_val_add (>ime, LINK_WAIT_TIMEOUT_USEC);
+ if (!g_cond_timed_wait (link_main_cond, link_main_lock, >ime)) {
+ if (link_is_locked ())
+ link_unlock ();
+ return FALSE;
+ }
  }
+
+ return TRUE;
 }
 
 
Index: src/orb/GIOP/giop-recv-buffer.c
===================================================================
RCS file: /cvs/gnome/ORBit2/src/orb/GIOP/giop-recv-buffer.c,v
retrieving revision 1.81
diff -u -p -r1.81 giop-recv-buffer.c
--- src/orb/GIOP/giop-recv-buffer.c 5 Apr 2006 07:17:17 -0000 1.81
+++ src/orb/GIOP/giop-recv-buffer.c 4 Dec 2006 13:20:50 -0000
@@ -690,10 +690,21 @@ check_got (GIOPMessageQueueEntry *ent)
  (ent->cnx->parent.status == LINK_DISCONNECTED));
 }
 
+static glong giop_initial_timeout_limit = GIOP_INITIAL_TIMEOUT_LIMIT;
+
+void
+giop_recv_set_timeout (const glong timeout)
+{
+ if (0 < timeout) /* We really do not want (timeout <= 0) as that would potentially block forever */
+ giop_initial_timeout_limit = timeout;
+}
+
 GIOPRecvBuffer *
-giop_recv_buffer_get (GIOPMessageQueueEntry *ent)
+giop_recv_buffer_get (GIOPMessageQueueEntry *ent,
+      gboolean *timeout)
 {
  GIOPThread *tdata = giop_thread_self ();
+ GTimeVal tval;
 
  thread_switch:
  if (giop_thread_io ()) {
@@ -704,8 +715,17 @@ giop_recv_buffer_get (GIOPMessageQueueEn
  ent_unlock (ent);
  giop_thread_queue_process (tdata);
  ent_lock (ent);
- } else
- g_cond_wait (tdata->incoming, tdata->lock);
+ } else {
+ if (0 < giop_initial_timeout_limit) {
+ g_get_current_time (&tval);
+ g_time_val_add (&tval, giop_initial_timeout_limit);
+ }
+ if (!g_cond_timed_wait (tdata->incoming, tdata->lock, ((0 < giop_initial_timeout_limit) ? &tval : NULL))) {
+ *timeout = TRUE;
+ break;
+ } else
+ *timeout = FALSE;
+ }
  }
 
  ent_unlock (ent);
@@ -1352,3 +1372,4 @@ giop_recv_buffer_use_buf (GIOPConnection
 
  return buf;
 }
+
Index: src/orb/orb-core/corba-orb.c
===================================================================
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/corba-orb.c,v
retrieving revision 1.123
diff -u -p -r1.123 corba-orb.c
--- src/orb/orb-core/corba-orb.c 16 Aug 2006 09:28:30 -0000 1.123
+++ src/orb/orb-core/corba-orb.c 4 Dec 2006 13:20:51 -0000
@@ -61,7 +61,7 @@ static char        *orbit_debug_options
 static char        *orbit_naming_ref         = NULL;
 static GSList      *orbit_initref_list       = NULL;
 static gboolean     orbit_use_corbaloc       = FALSE;
-
+static gint         orbit_timeout_limit      = -1;
 void
 ORBit_ORB_start_servers (CORBA_ORB orb)
 {
@@ -417,8 +417,8 @@ CORBA_ORB_init (int *argc, char **argv,
  }
 #endif /* G_ENABLE_DEBUG */
 
-
  giop_recv_set_limit (orbit_initial_recv_limit);
+ giop_recv_set_timeout (orbit_timeout_limit);
  giop_init (thread_safe,
    orbit_use_ipv4 || orbit_use_ipv6 ||
    orbit_use_irda || orbit_use_ssl);
@@ -1467,6 +1467,7 @@ const ORBit_option orbit_supported_optio
  { "ORBDebugFlags",      ORBIT_OPTION_STRING,  &orbit_debug_options },
  { "ORBInitRef",         ORBIT_OPTION_KEY_VALUE,  &orbit_initref_list},
  { "ORBCorbaloc",        ORBIT_OPTION_BOOLEAN, &orbit_use_corbaloc},
+ { "GIOPTimeoutLimit",   ORBIT_OPTION_INT,     &orbit_timeout_limit },
  { NULL,                 0,                    NULL },
 };
 
Index: src/orb/orb-core/orbit-small.c
===================================================================
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/orbit-small.c,v
retrieving revision 1.97
diff -u -p -r1.97 orbit-small.c
--- src/orb/orb-core/orbit-small.c 18 May 2006 14:20:49 -0000 1.97
+++ src/orb/orb-core/orbit-small.c 4 Dec 2006 13:20:51 -0000
@@ -591,6 +591,7 @@ ORBit_small_invoke_stub (CORBA_Object  
  GIOPRecvBuffer         *recv_buffer = NULL;
  CORBA_Object            xt_proxy = CORBA_OBJECT_NIL;
  ORBitPolicy            *invoke_policy = CORBA_OBJECT_NIL;
+ gboolean                timeout = FALSE;
 
  if (!obj) {
  dprintf (MESSAGES, "Cannot invoke method on null object\n");
@@ -654,7 +655,9 @@ ORBit_small_invoke_stub (CORBA_Object  
  goto clean_out;
  }
 
- recv_buffer = giop_recv_buffer_get (&mqe);
+ recv_buffer = giop_recv_buffer_get (&mqe, &timeout);
+ if (timeout)
+ goto timeout_exception;
 
  switch (orbit_small_demarshal (obj, &cnx, recv_buffer, ev,
        ret, m_data, args))
@@ -698,6 +701,12 @@ ORBit_small_invoke_stub (CORBA_Object  
  tprintf ("[System exception comm failure] )");
  CORBA_exception_set_system (ev, ex_CORBA_COMM_FAILURE,
     completion_status);
+ goto clean_out;
+
+ timeout_exception:
+ tprintf ("[System exception timeout] )");
+ CORBA_exception_set_system (ev, ex_CORBA_TIMEOUT,
+    CORBA_COMPLETED_NO);
  goto clean_out;
 }
 


_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: [PATCH] Timeout support - Please review

by colding :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> Please review and test this patch. I'm not entirely convinced that I
> haven't introduced a bad bug or two but at least it works(*) here.

No responses yet, so it seems that my patch is of the usual flawless
quality ;-)

Anyway, I'll commit tomorrow if there is no objections.

Best regards,
  jules


_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: [PATCH] Timeout support - Please review

by Gustavo J. A. M. Carneiro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> > Please review and test this patch. I'm not entirely convinced that I
> > haven't introduced a bad bug or two but at least it works(*) here.
>
> No responses yet, so it seems that my patch is of the usual flawless
> quality ;-)

  Heh :)
 
  In fact, the patch sounds really nice, a sign of ORBit2 maturing for
TCP/IP connections (which is traditionally seldom tested).

  My only comment is that the changelog entry could use a bit more
detail.  You only mention changes in one function, but the patch changes
4 functions and even adds a new commandline option which deserves a
mention in the changelog too.

  Cheers,

--
Gustavo J. A. M. Carneiro
<gjc@...> <gustavo@...>
The universe is always one step beyond logic.

_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: [PATCH] Timeout support - Please review

by colding :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M. Carneiro wrote:

> On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> > On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> > > Please review and test this patch. I'm not entirely convinced that I
> > > haven't introduced a bad bug or two but at least it works(*) here.
> >
> > No responses yet, so it seems that my patch is of the usual flawless
> > quality ;-)
>
>   Heh :)
>  
>   In fact, the patch sounds really nice, a sign of ORBit2 maturing for
> TCP/IP connections (which is traditionally seldom tested).
>
>   My only comment is that the changelog entry could use a bit more
> detail.  You only mention changes in one function, but the patch changes
> 4 functions and even adds a new commandline option which deserves a
> mention in the changelog too.

OK, I'll expand the ChangeLog and resend the patch :-)


Best regards,
  jules


_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: [PATCH] Timeout support - Please review

by colding :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M. Carneiro wrote:

> On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> > On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> > > Please review and test this patch. I'm not entirely convinced that I
> > > haven't introduced a bad bug or two but at least it works(*) here.
> >
> > No responses yet, so it seems that my patch is of the usual flawless
> > quality ;-)
>
>   Heh :)
>  
>   In fact, the patch sounds really nice, a sign of ORBit2 maturing for
> TCP/IP connections (which is traditionally seldom tested).
>
>   My only comment is that the changelog entry could use a bit more
> detail.  You only mention changes in one function, but the patch changes
> 4 functions and even adds a new commandline option which deserves a
> mention in the changelog too.

Here is the new and better documented patch.

Best regards,
  jules


Index: ChangeLog
===================================================================
RCS file: /cvs/gnome/ORBit2/ChangeLog,v
retrieving revision 1.778
diff -u -p -r1.778 ChangeLog
--- ChangeLog 22 Nov 2006 20:34:39 -0000 1.778
+++ ChangeLog 5 Dec 2006 12:04:04 -0000
@@ -1,3 +1,95 @@
+2006-12-05  Jules Colding  <colding@...>
+
+ * ORBit2: The previous two ChangeLog entries are hiding a lot of
+ details which I'll hereby try to expand upon.
+
+ The rationale for these changes is that any and all occurrences of
+ g_cond_wait() in ORBit2 has the potential to block forever if it
+ waits for a connection attempt to complete if said connection
+ attempt is made towards a remote server which happens to be
+ physically disconnected or powered off.
+
+ This blocking behavior can be demonstrated by invoking an operation
+ on a remote object that is physically inaccessible such as when
+ the remote server is powered down or physically disconnected.
+
+ Pseudo code:
+
+ {
+    CORBA_ORB orb = get_orb();
+    char *objref = get_corbaloc_str();
+    CORBA_Object obj = CORBA_OBJECT_NIL;
+    CORBA_Environment ev[1];
+
+    CORBA_exception_init(ev);
+    obj = CORBA_ORB_string_to_object(orb, objref, ev);
+
+    FOO_INTERFACE_method(obj, <arguments>, ev);
+ }
+
+ The last statement above will block forever in g_cond_wait() while
+ waiting for the recieve buffer to be ready if the remote server is,
+ say, powered down. An exception is on the other hand immediately
+ thrown if the server object is merely gone.
+
+ The changes to the code therefore boils down to the implementation
+ of the ex_CORBA_TIMEOUT system exception being thrown if
+ g_cond_timed_wait() times out. Most occurrences of g_cond_wait()
+ has been replaced by g_cond_timed_wait().
+
+ The only remaining instance of g_cond_wait() is found in
+ link_exec_command(). I haven't observed any problems with
+ link_exec_command() so I've not touched that code.
+
+ The affected code is:
+
+ 1) A new ORB command line option: GIOPTimeoutLimit - timeout limit
+ in microseconds. Defaults to 30 seconds.
+
+ 2) giop_recv_buffer_get(). This one is obvious. We do not want to
+ wait indefinitely for data to be received from a downed host. The
+ waiting interval is configurable with the new GIOPTimeoutLimit ORB
+ option.
+
+ 3) Any code invoking link_wait():
+
+   a) link_connection_wait_connected_T()
+   b) link_connection_try_reconnect()
+   c) giop_connection_try_reconnect(), calls link_connection_try_reconnect()
+   d) ORBit_try_connection_T(), calls giop_connection_try_reconnect()
+   e) ORBit_object_get_connection(), calls ORBit_try_connection_T()
+
+ 4) The waiting time in link_wait() is set at compile time to
+ LINK_WAIT_TIMEOUT_USEC which is presently 10 seconds.
+
+ 5) link_wait() has been modified to return a gboolean. FALSE if the
+ timeout expired, TRUE otherwise.
+
+ 6) link_connection_wait_connected_T() and link_connection_try_reconnect()
+ will both disconnect the connection if link_wait() experienced a timeout.
+
+2006-12-04  Jules Colding  <colding@...>
+
+ * configure.in: Added autoconf 2.60+ required datarootdir variable
+
+2006-12-03  Jules Colding  <colding@...>
+
+ * include/orbit/GIOP/giop.h: Declare giop_recv_set_timeout().
+
+ * src/orb/orb-core/corba-orb.c (CORBA_ORB_init): Set GIOP timeout
+ limit from the ORB option.
+ Added new ORB option - GIOPTimeoutLimit.
+
+ * src/orb/orb-core/orbit-small.c (ORBit_small_invoke_stub): Throw a
+ TIMEOUT exception if a reply hasn't been recieved within the GIOP
+ timeout limit.
+
+ * src/orb/GIOP/giop-recv-buffer.c (giop_recv_buffer_get): Replaced
+ a g_cond_wait() with a g_cond_timed_wait(). The original g_cond_wait()
+ could potentially block forever if a remote server was physically
+ offline.
+ (giop_recv_set_timeout): Function to adjust the GIOP timeout limit.
+
 2006-11-22  Kjartan Maraas  <kmaraas@...>
 
  * ORBit-2.0.pc.in: Move MINGW_CFLAGS and LIBM to Libs.private
Index: configure.in
===================================================================
RCS file: /cvs/gnome/ORBit2/configure.in,v
retrieving revision 1.175
diff -u -p -r1.175 configure.in
--- configure.in 22 Nov 2006 20:34:41 -0000 1.175
+++ configure.in 5 Dec 2006 12:04:04 -0000
@@ -37,6 +37,9 @@ AM_CONFIG_HEADER(config.h)
 dnl Initialize automake stuff
 AM_INIT_AUTOMAKE(ORBit2, $ORBIT_VERSION, no-define)
 
+dnl Required by autoconf 2.60
+AC_SUBST(datarootdir)
+
 AC_CANONICAL_HOST
 AC_MSG_CHECKING([for Win32])
 case "$host" in
Index: include/orbit/GIOP/giop-recv-buffer.h
===================================================================
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop-recv-buffer.h,v
retrieving revision 1.33
diff -u -p -r1.33 giop-recv-buffer.h
--- include/orbit/GIOP/giop-recv-buffer.h 14 Jan 2004 11:04:01 -0000 1.33
+++ include/orbit/GIOP/giop-recv-buffer.h 5 Dec 2006 12:04:04 -0000
@@ -58,7 +58,8 @@ void            giop_recv_list_setup_que
 void            giop_recv_list_setup_queue_entry_async (GIOPMessageQueueEntry *ent,
  GIOPAsyncCallback      cb);
 
-GIOPRecvBuffer *giop_recv_buffer_get               (GIOPMessageQueueEntry *ent);
+GIOPRecvBuffer *giop_recv_buffer_get               (GIOPMessageQueueEntry *ent,
+    gboolean *timeout);
 void            giop_recv_buffer_unuse             (GIOPRecvBuffer        *buf);
 
 #define giop_recv_buffer_reply_status(buf) (                                            \
Index: include/orbit/GIOP/giop-types.h
===================================================================
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop-types.h,v
retrieving revision 1.21
diff -u -p -r1.21 giop-types.h
--- include/orbit/GIOP/giop-types.h 27 Oct 2003 16:14:12 -0000 1.21
+++ include/orbit/GIOP/giop-types.h 5 Dec 2006 12:04:04 -0000
@@ -35,7 +35,9 @@ struct _GIOPThread {
  gpointer dummy);
 };
 
-#define GIOP_INITIAL_MSG_SIZE_LIMIT 256*1024
+#define GIOP_INITIAL_TIMEOUT_LIMIT (30000000) /* 30 seconds */
+
+#define GIOP_INITIAL_MSG_SIZE_LIMIT (256*1024) /* in bytes */
 
 typedef enum {
  GIOP_CONNECTION_SSL
Index: include/orbit/GIOP/giop.h
===================================================================
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop.h,v
retrieving revision 1.24
diff -u -p -r1.24 giop.h
--- include/orbit/GIOP/giop.h 5 Apr 2006 07:16:10 -0000 1.24
+++ include/orbit/GIOP/giop.h 5 Dec 2006 12:04:04 -0000
@@ -23,6 +23,7 @@ gboolean    giop_thread_safe       (void
 gboolean    giop_thread_io         (void);
 GIOPThread *giop_thread_self       (void);
 void        giop_invoke_async      (GIOPMessageQueueEntry *ent);
+void        giop_recv_set_timeout  (const glong timeout);
 void        giop_recv_set_limit    (glong limit);
 glong       giop_recv_get_limit    (void);
 void        giop_incoming_signal_T (GIOPThread *tdata, GIOPMsgType t);
@@ -45,6 +46,7 @@ void        giop_thread_new_check      
 void        giop_thread_queue_process    (GIOPThread *tdata);
 gboolean    giop_thread_queue_empty_T    (GIOPThread *tdata);
 void        giop_thread_queue_tail_wakeup(GIOPThread *tdata);
+
 
 #endif /* ORBIT2_INTERNAL_API */
 
Index: linc2/ChangeLog
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/ChangeLog,v
retrieving revision 1.262
diff -u -p -r1.262 ChangeLog
--- linc2/ChangeLog 22 Nov 2006 19:13:41 -0000 1.262
+++ linc2/ChangeLog 5 Dec 2006 12:04:05 -0000
@@ -1,3 +1,8 @@
+2006-12-04  Jules Colding  <colding@...>
+
+ * src/linc.c (link_wait): Do not wait forever for the link
+ condition to get signaled.
+
 2006-11-22  Kjartan Maraas  <kmaraas@...>
 
  * src/Makefile.am: Remove SSL_LIBS and SSL_CFLAGS.
Index: linc2/include/linc/linc.h
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/include/linc/linc.h,v
retrieving revision 1.17
diff -u -p -r1.17 linc.h
--- linc2/include/linc/linc.h 28 Jan 2005 12:34:57 -0000 1.17
+++ linc2/include/linc/linc.h 5 Dec 2006 12:04:05 -0000
@@ -33,7 +33,7 @@ GMainLoop *link_main_get_loop    (void);
 guint      link_main_idle_add    (GSourceFunc function,
   gpointer    data);
 
-void       link_wait             (void);
+gboolean   link_wait             (void);
 void       link_signal           (void);
 
 gboolean   link_thread_io        (void);
Index: linc2/src/linc-connection.c
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/src/linc-connection.c,v
retrieving revision 1.117
diff -u -p -r1.117 linc-connection.c
--- linc2/src/linc-connection.c 9 Sep 2006 07:54:37 -0000 1.117
+++ linc2/src/linc-connection.c 5 Dec 2006 12:04:06 -0000
@@ -635,8 +635,10 @@ link_connection_do_initiate (LinkConnect
 static LinkConnectionStatus
 link_connection_wait_connected_T (LinkConnection *cnx)
 {
- while (cnx && cnx->status == LINK_CONNECTING)
- link_wait ();
+ while (cnx && cnx->status == LINK_CONNECTING) {
+ if (!link_wait ())
+ link_connection_disconnect (cnx);
+ }
 
  return cnx ? cnx->status : LINK_DISCONNECTED;
 }
@@ -659,8 +661,12 @@ link_connection_try_reconnect (LinkConne
  cnx->inhibit_reconnect = FALSE;
  dispatch_callbacks_drop_lock (cnx);
  g_main_context_release (NULL);
- } else
- link_wait ();
+ } else {
+ if (!link_wait ()) {
+ link_connection_disconnect (cnx);
+ break;
+ }
+ }
  }
 
  if (cnx->status != LINK_DISCONNECTED)
Index: linc2/src/linc.c
===================================================================
RCS file: /cvs/gnome/ORBit2/linc2/src/linc.c,v
retrieving revision 1.63
diff -u -p -r1.63 linc.c
--- linc2/src/linc.c 2 Jun 2006 08:14:22 -0000 1.63
+++ linc2/src/linc.c 5 Dec 2006 12:04:06 -0000
@@ -52,6 +52,9 @@ SSL_METHOD *link_ssl_method;
 SSL_CTX    *link_ssl_ctx;
 #endif
 
+/* max time to wait for the link condition to get signaled - 10 seconds */
+#define LINK_WAIT_TIMEOUT_USEC (10000000)
+
 static void link_dispatch_command (gpointer data, gboolean immediate);
 
 gboolean
@@ -534,17 +537,28 @@ link_signal (void)
  }
 }
 
-void
+gboolean
 link_wait (void)
 {
+ GTimeVal gtime;
+
  if (!(link_is_thread_safe && link_is_io_in_thread)) {
  link_unlock ();
  link_main_iteration (TRUE);
  link_lock ();
  } else {
  g_assert (link_main_cond != NULL);
- g_cond_wait (link_main_cond, link_main_lock);
+
+ g_get_current_time (>ime);
+ g_time_val_add (>ime, LINK_WAIT_TIMEOUT_USEC);
+ if (!g_cond_timed_wait (link_main_cond, link_main_lock, >ime)) {
+ if (link_is_locked ())
+ link_unlock ();
+ return FALSE;
+ }
  }
+
+ return TRUE;
 }
 
 
Index: src/orb/GIOP/giop-recv-buffer.c
===================================================================
RCS file: /cvs/gnome/ORBit2/src/orb/GIOP/giop-recv-buffer.c,v
retrieving revision 1.81
diff -u -p -r1.81 giop-recv-buffer.c
--- src/orb/GIOP/giop-recv-buffer.c 5 Apr 2006 07:17:17 -0000 1.81
+++ src/orb/GIOP/giop-recv-buffer.c 5 Dec 2006 12:04:07 -0000
@@ -690,10 +690,21 @@ check_got (GIOPMessageQueueEntry *ent)
  (ent->cnx->parent.status == LINK_DISCONNECTED));
 }
 
+static glong giop_initial_timeout_limit = GIOP_INITIAL_TIMEOUT_LIMIT;
+
+void
+giop_recv_set_timeout (const glong timeout)
+{
+ if (0 < timeout) /* We really do not want (timeout <= 0) as that would potentially block forever */
+ giop_initial_timeout_limit = timeout;
+}
+
 GIOPRecvBuffer *
-giop_recv_buffer_get (GIOPMessageQueueEntry *ent)
+giop_recv_buffer_get (GIOPMessageQueueEntry *ent,
+      gboolean *timeout)
 {
  GIOPThread *tdata = giop_thread_self ();
+ GTimeVal tval;
 
  thread_switch:
  if (giop_thread_io ()) {
@@ -704,8 +715,17 @@ giop_recv_buffer_get (GIOPMessageQueueEn
  ent_unlock (ent);
  giop_thread_queue_process (tdata);
  ent_lock (ent);
- } else
- g_cond_wait (tdata->incoming, tdata->lock);
+ } else {
+ if (0 < giop_initial_timeout_limit) {
+ g_get_current_time (&tval);
+ g_time_val_add (&tval, giop_initial_timeout_limit);
+ }
+ if (!g_cond_timed_wait (tdata->incoming, tdata->lock, ((0 < giop_initial_timeout_limit) ? &tval : NULL))) {
+ *timeout = TRUE;
+ break;
+ } else
+ *timeout = FALSE;
+ }
  }
 
  ent_unlock (ent);
@@ -1352,3 +1372,4 @@ giop_recv_buffer_use_buf (GIOPConnection
 
  return buf;
 }
+
Index: src/orb/orb-core/corba-orb.c
===================================================================
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/corba-orb.c,v
retrieving revision 1.123
diff -u -p -r1.123 corba-orb.c
--- src/orb/orb-core/corba-orb.c 16 Aug 2006 09:28:30 -0000 1.123
+++ src/orb/orb-core/corba-orb.c 5 Dec 2006 12:04:07 -0000
@@ -61,7 +61,7 @@ static char        *orbit_debug_options
 static char        *orbit_naming_ref         = NULL;
 static GSList      *orbit_initref_list       = NULL;
 static gboolean     orbit_use_corbaloc       = FALSE;
-
+static gint         orbit_timeout_limit      = -1;
 void
 ORBit_ORB_start_servers (CORBA_ORB orb)
 {
@@ -417,8 +417,8 @@ CORBA_ORB_init (int *argc, char **argv,
  }
 #endif /* G_ENABLE_DEBUG */
 
-
  giop_recv_set_limit (orbit_initial_recv_limit);
+ giop_recv_set_timeout (orbit_timeout_limit);
  giop_init (thread_safe,
    orbit_use_ipv4 || orbit_use_ipv6 ||
    orbit_use_irda || orbit_use_ssl);
@@ -1467,6 +1467,7 @@ const ORBit_option orbit_supported_optio
  { "ORBDebugFlags",      ORBIT_OPTION_STRING,  &orbit_debug_options },
  { "ORBInitRef",         ORBIT_OPTION_KEY_VALUE,  &orbit_initref_list},
  { "ORBCorbaloc",        ORBIT_OPTION_BOOLEAN, &orbit_use_corbaloc},
+ { "GIOPTimeoutLimit",   ORBIT_OPTION_INT,     &orbit_timeout_limit },
  { NULL,                 0,                    NULL },
 };
 
Index: src/orb/orb-core/orbit-small.c
===================================================================
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/orbit-small.c,v
retrieving revision 1.97
diff -u -p -r1.97 orbit-small.c
--- src/orb/orb-core/orbit-small.c 18 May 2006 14:20:49 -0000 1.97
+++ src/orb/orb-core/orbit-small.c 5 Dec 2006 12:04:08 -0000
@@ -591,6 +591,7 @@ ORBit_small_invoke_stub (CORBA_Object  
  GIOPRecvBuffer         *recv_buffer = NULL;
  CORBA_Object            xt_proxy = CORBA_OBJECT_NIL;
  ORBitPolicy            *invoke_policy = CORBA_OBJECT_NIL;
+ gboolean                timeout = FALSE;
 
  if (!obj) {
  dprintf (MESSAGES, "Cannot invoke method on null object\n");
@@ -654,7 +655,9 @@ ORBit_small_invoke_stub (CORBA_Object  
  goto clean_out;
  }
 
- recv_buffer = giop_recv_buffer_get (&mqe);
+ recv_buffer = giop_recv_buffer_get (&mqe, &timeout);
+ if (timeout)
+ goto timeout_exception;
 
  switch (orbit_small_demarshal (obj, &cnx, recv_buffer, ev,
        ret, m_data, args))
@@ -698,6 +701,12 @@ ORBit_small_invoke_stub (CORBA_Object  
  tprintf ("[System exception comm failure] )");
  CORBA_exception_set_system (ev, ex_CORBA_COMM_FAILURE,
     completion_status);
+ goto clean_out;
+
+ timeout_exception:
+ tprintf ("[System exception timeout] )");
+ CORBA_exception_set_system (ev, ex_CORBA_TIMEOUT,
+    CORBA_COMPLETED_NO);
  goto clean_out;
 }
 


_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: [PATCH] Timeout support - Please review

by colding :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, 2006-12-05 at 13:06 +0100, Jules Colding wrote:

> On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M. Carneiro wrote:
> > On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> > > On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> > > > Please review and test this patch. I'm not entirely convinced that I
> > > > haven't introduced a bad bug or two but at least it works(*) here.
> > >
> > > No responses yet, so it seems that my patch is of the usual flawless
> > > quality ;-)
> >
> >   Heh :)
> >  
> >   In fact, the patch sounds really nice, a sign of ORBit2 maturing for
> > TCP/IP connections (which is traditionally seldom tested).
> >
> >   My only comment is that the changelog entry could use a bit more
> > detail.  You only mention changes in one function, but the patch changes
> > 4 functions

BTW: The changes to link_wait() is in linc2/ChangeLog.

--
  jules


_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: [PATCH] Timeout support - Please review

by Gustavo J. A. M. Carneiro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Ter, 2006-12-05 at 13:06 +0100, Jules Colding wrote:

> On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M. Carneiro wrote:
> > On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> > > On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> > > > Please review and test this patch. I'm not entirely convinced that I
> > > > haven't introduced a bad bug or two but at least it works(*) here.
> > >
> > > No responses yet, so it seems that my patch is of the usual flawless
> > > quality ;-)
> >
> >   Heh :)
> >  
> >   In fact, the patch sounds really nice, a sign of ORBit2 maturing for
> > TCP/IP connections (which is traditionally seldom tested).
> >
> >   My only comment is that the changelog entry could use a bit more
> > detail.  You only mention changes in one function, but the patch changes
> > 4 functions and even adds a new commandline option which deserves a
> > mention in the changelog too.
>
> Here is the new and better documented patch.

  Wow!  I wasn't asking for so much level detail... :)

  But I guess this is good, if a bit unusual :)

  Best regards,

--
Gustavo J. A. M. Carneiro
<gjc@...> <gustavo@...>
The universe is always one step beyond logic.


_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: Timeout support - Please review

by Michael Meeks-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Jules,

On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> Any and all occurrences of g_cond_wait() has the potential to block
> forever if the remote server is physically disconnected or powered of.

        Urgh - but also, some methods take longer than 30 seconds to execute,
and will not respond in that time.

        Also - when you are debugging, it is extremely normal to want to pause
everything in the debugger to examine some interaction - and you don't
expect everything to fail in unpredictable (or even silent ways) when
you continue.

        So: can we do this for IPv4/6 connections only ? local unix
            socket connections have strong lifecycle guarentees and this
            covers the desktop use-case nicely.

> + * src/linc.c (link_wait): Do not wait forever for the link
> + condition to get signaled.
..
> + if (!g_cond_timed_wait (tdata->incoming, tdata->lock, ((0 < giop_initial_timeout_limit) ? &tval : NULL))) {
> + *timeout = TRUE;
> + break;

        So I'm convinced this code is in the wrong place - worse, the link_wait
function now -releases- a mutex it didn't take itself: which looks
horribly un-maintainable: who took that lock ? and worse doing:

        if (foo_is_locked())
                unlock_foo();

        is just grotesquely racey.

        So - IMHO, the -right- way to implement this is rather simpler:

        * in the IO thread, we need to add a simple 'timeout' source to
          the glib mainloop, that will timeout after (however long) and
          in this case just push a constructed CORBA_COMM exception into
          a synthesised reply that we shove into the queue as normal:
          thus signalling the waiting caller.

        Surely that is simpler, easier, touches just 1 place, and doesn't
introduce odd behavior ? :-)

        Thanks,

                Michael.

PS. if you really want patch review, can you CC me explicitly.
--
 michael.meeks@...  <><, Pseudo Engineer, itinerant idiot


_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list

Re: Timeout support - Please review

by colding :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Michael,

On Tue, 2007-06-12 at 10:02 +0100, Michael Meeks wrote:
> On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> > Any and all occurrences of g_cond_wait() has the potential to block
> > forever if the remote server is physically disconnected or powered of.
>
> Urgh - but also, some methods take longer than 30 seconds to execute,
> and will not respond in that time.

Yes, you are right.


> So: can we do this for IPv4/6 connections only ? local unix
>    socket connections have strong lifecycle guarentees and this
>    covers the desktop use-case nicely.

Yes.


> > + * src/linc.c (link_wait): Do not wait forever for the link
> > + condition to get signaled.
> ..
> > + if (!g_cond_timed_wait (tdata->incoming, tdata->lock, ((0 < giop_initial_timeout_limit) ? &tval : NULL))) {
> > + *timeout = TRUE;
> > + break;
>
> So I'm convinced this code is in the wrong place - worse, the link_wait
> function now -releases- a mutex it didn't take itself: which looks
> horribly un-maintainable: who took that lock ? and worse doing:
>
> if (foo_is_locked())
> unlock_foo();
>
> is just grotesquely racey.

Yes, I know. Despite my previous bragging about "flawless quality" I was
actually a little worried about this too...


> So - IMHO, the -right- way to implement this is rather simpler:
>
> * in the IO thread, we need to add a simple 'timeout' source to
>  the glib mainloop, that will timeout after (however long) and
>  in this case just push a constructed CORBA_COMM exception into
>  a synthesised reply that we shove into the queue as normal:
>  thus signalling the waiting caller.
>
> Surely that is simpler, easier, touches just 1 place, and doesn't
> introduce odd behavior ? :-)

Absolutely. I'll reverse my previous patch and look into this approach
instead.

Stay tuned,
  jules



_______________________________________________
orbit-list mailing list
orbit-list@...
http://mail.gnome.org/mailman/listinfo/orbit-list