Ticket 1078 - Remove pendingLimit from OSyncQueue

by Henrik /KaarPoSoft

Dear all,

Is there any progress on ticket 1078, or any comments or plans?
See also the comments on the mailing list regarding
"Remove pendingLimit from OSyncQueue".

As far as I can see, this problem prevents any sync with SyncML devices,
as the SyncML protocol may have several changes in one message,
which I believe OpenSync cannot handle now...

/Henrik


------------------------------------------------------------------------------
Come build with us! The BlackBerry® Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9-12, 2009. Register now!
http://p.sf.net/sfu/devconf
_______________________________________________
Opensync-devel mailing list
Opensync-devel@...
https://lists.sourceforge.net/lists/listinfo/opensync-devel

Re: Ticket 1078 - Remove pendingLimit from OSyncQueue

by Graham Cobb-4

On Friday 18 September 2009 01:22:25 Henrik /KaarPoSoft wrote:
> Are there any progress / comments / plans on ticket 1078?
> See also comments on the mailinglist regarding
> "Remove pendingLimit from OSyncQueue".

I was hoping someone might comment on my suggestion back in April (attached).  
On the other hand, I have not done anything about it since that message --
not even drafted the API necessary to implement my suggestion.  My fault,
sorry.

> As far as I can see, this problem prevents any sync with SyncML devices,
> as the SyncML protocol may have several changes in one message,
> which I believe OpenSync cannot handle now...

My proposal is to disable the timeout handling completely for now and for me
(or someone else if they wish) to implement a real fix over the next few
months.  I am not going to get a chance to work on this for the next several
weeks, unfortunately (although I can check in a change to just disable
timeout handling straight away if that is what people want).

The risk is that the real fix may involve an API change.  However, I think
that can be done so that (i) no changes are needed in plugins which do not do
the SyncML-style batching, and (ii) the API changes are limited to some
additional functions, not changes to the existing API.

Graham

On Tuesday 14 April 2009 13:39:43 Daniel Gollub wrote:
> On Tuesday 14 April 2009 02:25:29 pm Michael Bell wrote:
> > I don't think that a dependency between two pipes is a good idea but I
> > understand that IPC has limits. I also used IBM MQ series in the past
> > which has nothing to do with IPC. So is our message queue implementation
> > a real IPC implementation which requires limits?
>
> No idea - maybe Graham can answer this.

Sorry about the delay -- I have been away and have not had a chance to spend
any time on this until today.

I understand the problem, and the current timeout/limit mechanism definitely
deadlocks with the way the async plugins work today (I am ignoring any
changes suggested in this thread as I am not sure I understand exactly what
has been proposed).

The pendingLimit is there to allow timeouts to work properly.  If the
pendingLimit is just removed, the timeouts will break as they did before (the
timeout starts counting at the wrong time and so if there are a large number
of transactions queued up the timeout fires too early).  But let's review
what timeouts are for and how we would **like** them to behave.

As I understand it, the main purpose of the timeouts is to deal with cases
where the remote device (or some intermediary) has got stuck and is no longer
proceeding with transactions (but not returning errors).  It also helps with
cases where the plugin tries to send a message but does not notice that there
is an error (e.g. a socket has been disconnected) and it will never get a
reply.  This is, of course, a plugin bug but it is useful that the timeout
mechanism also protects against that problem.

There is a secondary use for timeouts and that is to protect against problems
in the IPC mechanism itself -- e.g. a process has stopped and is no longer
reading the pipe.  This is a smaller consideration and can be handled by
mechanisms within the IPC itself if necessary, so let's ignore it for now.

There seem to be three plugin architectures which are relevant (I thought,
when I was rewriting the timeout code, that there were only the first two but
I now realise there is a third):

1) Synchronous plugin (most plugins are like this, I believe): when a
transaction is received by the plugin (e.g. Connect or Get Changes or Commit)
it does synchronous writes to send messages to the device and synchronous
reads to get messages from the device.  If the device stops responding, the
thread executing the plugin will just wait.  No other plugin messages will be
handled while it is waiting.

2) Asynchronous but transaction-at-a-time plugin (maybe there are none like
this): when the transaction is received, the plugin sends the message to the
device and then returns.  The thread polls the socket and resumes when the
response is received. However, other plugin messages can be handled while it
is waiting -- so further updates will cause further messages to be sent to
the device.  If the device stops responding, the engine will keep sending
updates which the plugin will send, although it is not seeing any responses.

3) Asynchronous, multiple-transaction plugin (like SyncML): when the
transaction is received, it is stored internally to the plugin.  Nothing is
sent until a message fills up or the last transaction is received. Then all
updates are sent and, when responses are received, the updates are completed.  
If the device stops responding then all or some updates will not receive a
response.
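
As a rough illustration of architecture 3 (a toy model, not OpenSync or
SyncML plugin code), the sketch below buffers changes and only "sends" a
message when it is full or when the last change has arrived.  The class
name, the `max_per_message` capacity and the `is_last` flag are all
assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BatchingPlugin:
    """Toy model of a batching plugin; every name here is hypothetical."""
    max_per_message: int                               # assumed message capacity
    buffered: list = field(default_factory=list)       # changes not yet sent
    sent_messages: list = field(default_factory=list)  # batches "sent" so far

    def commit(self, change, is_last=False):
        # Store the change internally; nothing goes to the device yet.
        self.buffered.append(change)
        # Send only when the message fills up or no more changes will come.
        if len(self.buffered) >= self.max_per_message or is_last:
            self.sent_messages.append(list(self.buffered))
            self.buffered.clear()


plugin = BatchingPlugin(max_per_message=3)
changes = ["a", "b", "c", "d"]
for i, change in enumerate(changes):
    plugin.commit(change, is_last=(i == len(changes) - 1))
```

In this model the first change cannot be answered until the third one has
been handed to the plugin, which is why a limit on pending transactions
that is smaller than the batch size can stall such a plugin.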

For 1 and 2, the timeout is protecting each single commit and the value should
be set based on the time needed for that transaction.  In the case of 2, this
means the pendingLimit is needed -- it limits the number of updates that
might already be queued ahead of this one and so allows the timeout value to
be calculated (i.e. pendingLimit * maximum time for one update).
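
The arithmetic can be sketched as follows.  This is only an illustration
with made-up numbers: it assumes every update takes the maximum time, that
the device handles updates one by one, and that all timers start at the
moment the updates are queued.  The function names are hypothetical.

```python
def per_update_timeout(pending_limit, max_seconds_per_update):
    """Timeout value as described above: pendingLimit * max time per update."""
    return pending_limit * max_seconds_per_update


def completion_times(queued_updates, seconds_per_update):
    """Time at which each queued update finishes when handled one by one."""
    return [(i + 1) * seconds_per_update for i in range(queued_updates)]


# Illustrative values only: a limit of 5 and at most 10 seconds per update.
timeout = per_update_timeout(pending_limit=5, max_seconds_per_update=10)

# With at most 5 updates queued, every update finishes within the timeout.
timed_out_within = [t for t in completion_times(5, 10) if t > timeout]

# With 8 updates queued (no limit), the later ones exceed the same timeout
# even though the device is working normally.
timed_out_beyond = [t for t in completion_times(8, 10) if t > timeout]
```

Under these assumptions the limit is what keeps the fixed timeout value
meaningful: without it, healthy but long queues look like a stuck device.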

For 3, however, it is much harder.  One option would be to set the timeouts
for each commit (and the commit_all) based on the time the device needs to
complete a maximum-sized message of updates.  On the other hand, that doesn't
allow for the fact that the OpenSync engine itself might take some time (in
complex cases) to even provide enough updates to fill a message: timeouts
were not intended to have to take into account OpenSync engine processing.

Another option for 3 is to set no timeouts at all on the individual commits,
but set a timeout on the commit_all.  The problem with that is that the
timeout value for the commit_all is potentially unlimited (it is not limited
to a single message of commits as the commit_all will start as soon as the
commits have all been queued and hundreds of messages may have been sent to
the device).

On the other hand, the plugin itself knows what is going on.  So, I think the
best option for case 3 is for the plugin itself to control the timeouts.  I
suggest that in this case, there are no timeouts on the commit or commit_all,
but that the plugin itself sets a timeout when it has assembled a message and
sent it to the device.  I.e. add some sort of OSyncStartTimeout(int timeout)
and OSyncStopTimeout() calls.  The plugin would start the timeout when the
first message was sent to the device.  Whenever a response is received, the
timeout would be stopped and, if there were one or more messages still
waiting for responses, it would be started again.  If the timeout actually
fires, then the OSyncQueue code completes all pending operations with a
timeout error (just as in the existing timeout processing).
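
A minimal sketch of that start/stop/restart behaviour is shown below.  It
is not a proposed API: the class and method names are invented for the
example, and an explicit `now` value stands in for a real clock so the
logic can be followed step by step.

```python
class PluginTimeout:
    """Toy model of a plugin-controlled timeout; all names are hypothetical."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.deadline = None   # None means the timer is stopped
        self.outstanding = 0   # messages sent but not yet answered

    def message_sent(self, now):
        self.outstanding += 1
        # Start the timer when the first unanswered message goes out.
        if self.deadline is None:
            self.deadline = now + self.timeout

    def response_received(self, now):
        self.outstanding -= 1
        # Stop the timer, then restart it if messages are still waiting.
        self.deadline = None
        if self.outstanding > 0:
            self.deadline = now + self.timeout

    def expired(self, now):
        # In the proposal, expiry would complete all pending operations
        # with a timeout error; here it is only reported.
        return self.deadline is not None and now >= self.deadline


timer = PluginTimeout(timeout=30)
timer.message_sent(now=0)        # timer starts, deadline at 30
timer.message_sent(now=5)        # already running, deadline unchanged
timer.response_received(now=20)  # one message still waiting, deadline at 50
fired_at_40 = timer.expired(now=40)
fired_at_50 = timer.expired(now=50)

idle = PluginTimeout(timeout=30)
idle.message_sent(now=0)
idle.response_received(now=10)   # nothing outstanding, timer stays stopped
idle_fired = idle.expired(now=1000)
```

The point of the design is that the timer only runs while the device
actually owes a response, so time spent by the engine producing further
updates never counts against it.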

This does mean that for plugins using this third architecture, they have to
have some extra complexity.  But I can't see an alternative, if we want to
keep the timeout protection.  Of course, we could decide to disable timeout
processing altogether for this release of OpenSync -- and add it back in
later (with additions to the API at that time).

Does anyone have an alternative suggestion?  If not, I will spec up a
suggested API for the timeout operations.

Graham



Re: Ticket 1078 - Remove pendingLimit from OSyncQueue

by Christian Hilgers-2

Graham Cobb wrote:

> The risk is that the real fix may involve an API change.  However, I think
> that can be done with (i) no changes needed in plugins which do not do the
> SyncML-style batching, and (ii) the API changes would be limited to some
> additional functions, not changes to the existing API.

This should be possible without an API break, so it is no blocker for
releasing 0.39.

Christian
--
Christian Hilgers                  |ConSol*
Tel.   +49.2102.994-423            |Consulting&Solutions Software GmbH
Fax    +49.2102.994-411            |Berliner Str. 101, 40880 Ratingen
email: Christian.Hilgers@... |WWW: http://www.consol.de


Re: Ticket 1078 - Remove pendingLimit from OSyncQueue

by Henrik /KaarPoSoft

Dear all,

I am not familiar enough with the inner workings of OpenSync to comment
on the details ))-:

However, I do think a fix is necessary.
It is great to freeze the API for 0.39/0.40, but it would be great if it
would actually work too!

/Henrik



> ------------------------------------------------------------------------
>
> Subject: Re: [Opensync-devel] [RFC] Remove pendingLimit from OSyncQueue
> From: Graham Cobb <g+opensync@...>
> Date: Sat, 25 Apr 2009 17:22:00 +0100
> To: Daniel Gollub <gollub@...>
> CC: Michael Bell <michael.bell@...>, Opensync Devel <opensync-devel@...>
>

Re: Ticket 1078 - Remove pendingLimit from OSyncQueue

by Michael Bell

Hi Henrik,

Henrik /KaarPoSoft wrote:
>
> Are there any progress / comments / plans on ticket 1078?
> See also comments on the mailinglist regarding
> "Remove pendingLimit from OSyncQueue".
>
> As far as I can see, this problem prevents any sync with SyncML devices,
> as the SyncML protocol may have several changes in one message,
> which I believe OpenSync cannot handle now...

I am too far removed from current OpenSync development. I will be really
happy if I can update the SyncML plugin to the current OpenSync API by
the end of October.

So I'm sorry, but I can only hope that Daniel or Bjoern find the time to
fix the issue.

Sorry,
Michael
--
___________________________________________________________________

Michael Bell                        Humboldt-Universitaet zu Berlin

Tel.: +49 (0)30-2093 2482           ZE Computer- und Medienservice
Fax:  +49 (0)30-2093 2704           Unter den Linden 6
michael.bell@...       D-10099 Berlin
___________________________________________________________________

PGP Fingerprint: 09E4 3D29 4156 2774 0F2C  C643 D8BD 1918 2030 5AAB
