[RFC] Remove pendingLimit from OSyncQueue

Hi,

I would like to remove pendingLimit from OSyncQueue to fix the deadlock of ticket #1078. The combination of the OSyncQueue pendingLimit and messages that are not fully independent can produce deadlocks. Today the queue already causes deadlocks.

What happens?

The OSyncQueue stops dispatching once pendingLimit messages are waiting for an answer. This is correct IPC behaviour if the messages are independent, but it is not correct in the case of OpenSync.

Some protocols (e.g. SyncML) can only flush once for an object type (SyncML datastore). This means changes are collected and only sent when the maximum message size of the protocol is reached or all OpenSync messages are present. So the only guaranteed send operation happens when the OpenSync message committed_all is handled.

A first workaround was to commit all changes immediately and abort the complete synchronization if an error happens by signalling the error to the committed_all context. This does not work because the mapping (of SyncML) is potentially sent after the changes are received. OpenSync can handle mappings only for uncommitted changes.

So the decision is simply either to add a new mechanism for mapping IDs or to remove the pendingLimit, which creates a deadlock between two originally independent queues.

Best regards

Michael
-- 
Michael Bell                  Humboldt-Universitaet zu Berlin
Tel.: +49 (0)30-2093 2482     ZE Computer- und Medienservice
Fax:  +49 (0)30-2093 2704     Unter den Linden 6
michael.bell@...              D-10099 Berlin

PGP Fingerprint: 09E4 3D29 4156 2774 0F2C C643 D8BD 1918 2030 5AAB

_______________________________________________
Opensync-devel mailing list
Opensync-devel@...
https://lists.sourceforge.net/lists/listinfo/opensync-devel

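The deadlock described in this first mail can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyQueue`, `BatchingPlugin` and `run` are hypothetical names invented for the illustration, not OpenSync code, and the plugin is assumed to answer nothing until committed_all arrives.

```python
from collections import deque


class ToyQueue:
    """Hypothetical stand-in for OSyncQueue: stops dispatching at pending_limit."""

    def __init__(self, pending_limit=None):
        self.pending_limit = pending_limit  # None models "no limit"
        self.outgoing = deque()
        self.pending = 0  # dispatched messages still waiting for a reply

    def send(self, msg):
        self.outgoing.append(msg)

    def dispatch(self, handler):
        """Dispatch until the queue is empty (True) or the limit stalls it (False)."""
        while self.outgoing:
            if self.pending_limit is not None and self.pending >= self.pending_limit:
                return False  # stalled: committed_all can never be dispatched
            msg = self.outgoing.popleft()
            self.pending += 1
            self.pending -= handler(msg)  # handler returns how many replies it sent
        return True


class BatchingPlugin:
    """Hypothetical SyncML-like plugin: replies only once committed_all arrives."""

    def __init__(self):
        self.collected = 0

    def handle(self, msg):
        if msg == "commit_change":
            self.collected += 1
            return 0  # the change is only collected, no reply yet
        # committed_all: flush, answering every collected change plus itself
        replies = self.collected + 1
        self.collected = 0
        return replies


def run(pending_limit, n_changes):
    queue, plugin = ToyQueue(pending_limit), BatchingPlugin()
    for _ in range(n_changes):
        queue.send("commit_change")
    queue.send("committed_all")
    return queue.dispatch(plugin.handle)
```

In this model `run(2, 5)` stalls because two unanswered commits block the committed_all message that would have released them, while `run(None, 5)` completes.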
Re: [RFC] Remove pendingLimit from OSyncQueue

Hi,

I believe the OpenSync API must support the way the SyncML protocol works. This means:

1) The pendingLimit should be removed. As Michael writes, the SyncML protocol groups updates, so there may be any number of pending updates before a commit can be completed. (Maybe it would be possible to let the plugin define the pendingLimit, including setting it to unlimited?)

2) The mapping issue needs to be handled. The SyncML protocol defines that the mappings are sent after all changes, so the OpenSync API needs to be augmented with a way to handle this.

/Henrik

Re: [RFC] Remove pendingLimit from OSyncQueue

On Thursday 09 April 2009 10:28:05 am Michael Bell wrote:
> A first workaround was to commit all changes immediately and abort the
> complete synchronization if an error happens by signalling the error to
> the committed_all context. This does not work because the mapping (of
> SyncML) is potentially sent after the changes are received. OpenSync can
> handle mappings only for uncommitted changes.

That's wrong. See my mail to your batch_commit RFC thread.

> So the decision is simply to add a new mechanism for mapping IDs or to
> remove the pendingLimit which creates a deadlock between two originally
> independent queues.

I don't see the need for a new mapping ID mechanism. We just need to adapt when the UID can be updated by the plugin. Right now this can only happen in the "commit (change) context" - which I thought would be perfectly fine.

If you write an entry in SyncML - when do you get the peer's mapping ID of the just changed entry?

I wonder if it's really impossible to update the change UID before replying to the "commit (change) context". (Independent of the pendingLimit issue!)

Maybe it would help to introduce a new osync_context_ interface instead.

We change the definition of osync_context_report_success() within the commit context. Instead of (additionally) "updating" the UID of a change within this commit context, this function just ACKs that the commit got handled/scheduled within the plugin. Internally this would just free the pending queue - no commit report to the Engine, no signalling of a write event to the OpenSync frontend.

An additional osync_context_ interface could later report the changed UID after the write. For that the plugin would need to increase the ref of the OSyncContext* and unref it once the change or error got reported with an osync_context_ interface. We could actually reuse osync_context_update_change() here, which is also used in get_changes().

A plugin which uses the commit sink function in an async way must register a committed_all() sink function - to signal when all changes finally got committed.

It's similar to your osync_mapping_alter_interface() proposal, with the difference that there would be only internal changes - no public changes. Actually my long-term goal is that plugins only use the osync_context_ interface to report changes to the engine ;)

Would this solve the pendingLimit issue?

Best Regards,
Daniel
-- 
Daniel Gollub                        Geschaeftsfuehrer: Ralph Dehner
FOSS Developer                       Unternehmenssitz: Vohburg
B1 Systems GmbH                      Amtsgericht: Ingolstadt
Mobil: +49-(0)-160 47 73 970         Handelsregister: HRB 3537
EMail: gollub@...                    http://www.b1-systems.de

Adresse: B1 Systems GmbH, Osterfeldstraße 7, 85088 Vohburg
http://pgpkeys.pca.dfn.de/pks/lookup?op=get&search=0xED14B95C2F8CA78D

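The two-step flow proposed above (ACK first, report the UID later) might look roughly like the following toy model. It is a sketch, not the OpenSync C API: `ToyEngine`, `ToyContext` and `AsyncPlugin` are hypothetical classes, and their `report_success` / `update_change` methods only borrow the names of the functions mentioned in the mail.

```python
class ToyEngine:
    """Hypothetical engine side: tracks unanswered commits and known UIDs."""

    def __init__(self):
        self.pending = set()  # commits still waiting for an ACK
        self.uids = {}        # change id -> UID as known to the engine

    def send_commit(self, change_id, uid):
        self.pending.add(change_id)
        self.uids[change_id] = uid


class ToyContext:
    """Hypothetical ref-counted context handed to the plugin for one commit."""

    def __init__(self, engine, change_id):
        self.engine, self.change_id, self.refs = engine, change_id, 1

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1

    def report_success(self):
        # Proposed meaning: only ACK that the commit was scheduled.
        self.engine.pending.discard(self.change_id)

    def update_change(self, new_uid):
        # Proposed second step: report the UID once the entry was written.
        self.engine.uids[self.change_id] = new_uid


class AsyncPlugin:
    """Hypothetical plugin that writes everything when committed_all arrives."""

    def __init__(self):
        self.held = []

    def commit(self, ctx):
        ctx.ref()             # keep the context alive past the ACK
        self.held.append(ctx)
        ctx.report_success()  # frees the slot in the pending list

    def committed_all(self, peer_uids):
        for ctx in self.held:
            ctx.update_change(peer_uids[ctx.change_id])
            ctx.unref()
        self.held.clear()


def demo():
    engine, plugin, contexts = ToyEngine(), AsyncPlugin(), []
    for change_id in ("a", "b", "c"):
        engine.send_commit(change_id, uid="local-" + change_id)
        ctx = ToyContext(engine, change_id)
        plugin.commit(ctx)
        ctx.unref()           # the engine drops its own reference
        contexts.append(ctx)
    pending_after_acks = len(engine.pending)
    plugin.committed_all({"a": "peer-1", "b": "peer-2", "c": "peer-3"})
    return pending_after_acks, engine.uids, [c.refs for c in contexts]
```

In this model nothing is pending after the ACKs, so a pending limit would never be reached, and the UIDs still arrive once the batch has been written.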
Re: [RFC] Remove pendingLimit from OSyncQueue

Daniel Gollub wrote:
> On Thursday 09 April 2009 10:28:05 am Michael Bell wrote:
>> So the decision is simply to add a new mechanism for mapping IDs or to
>> remove the pendingLimit which creates a deadlock between two originally
>> independent queues.
>
> I don't see the need for a new mapping ID mechanism. We just need to adapt
> when the UID can be updated by the plugin. Right now this can only happen
> in the "commit (change) context" - which I thought would be perfectly fine.
>
> If you write an entry in SyncML - when do you get the peer's mapping ID of
> the just changed entry?

Potentially after all changes are available.

> I wonder if it's really impossible to update the change UID before replying
> to the "commit (change) context". (Independent of the pendingLimit issue!)
>
> Maybe it would help to introduce a new osync_context_ interface instead.
>
> We change the definition of osync_context_report_success() within the commit
> context. Instead of (additionally) "updating" the UID of a change within
> this commit context, this function just ACKs that the commit got
> handled/scheduled within the plugin. Internally this would just free the
> pending queue - no commit report to the Engine, no signalling of a write
> event to the OpenSync frontend.

This would work but it is a hack.

> An additional osync_context_ interface could later report the changed UID
> after the write. For that the plugin would need to increase the ref of the
> OSyncContext* and unref it once the change or error got reported with an
> osync_context_ interface. We could actually reuse
> osync_context_update_change() here, which is also used in get_changes().

If only the pendingLimit is influenced by the former call then we don't need additional interfaces because the change was not written. This means we can still use osync_change_set_uid.

> A plugin which uses the commit sink function in an async way must register
> a committed_all() sink function - to signal when all changes finally got
> committed.

No problem, this is what the SyncML plugin does.

> Would this solve the pendingLimit issue?

I think so. I'm just not sure about introducing an API function to hack the IPC stuff. Perhaps we should use a function name like osync_context_set_async. The problem is that the IPC stuff is already asynchronous and we just make it really async. Are there any other really asynchronous plugins?

I don't think that a dependency between two pipes is a good idea, but I understand that IPC has limits. I also used IBM MQ series in the past, which has nothing to do with IPC. So is our message queue implementation a real IPC implementation which requires limits?

Best regards

Michael

Re: [RFC] Remove pendingLimit from OSyncQueue

On Tuesday 14 April 2009 02:25:29 pm Michael Bell wrote:
> Daniel Gollub wrote:
> > If you write an entry in SyncML - when do you get the peer's mapping ID
> > of the just changed entry?
>
> Potentially after all changes are available.
>
> > We change the definition of osync_context_report_success() within the
> > commit context. Instead of (additionally) "updating" the UID of a change
> > within this commit context, this function just ACKs that the commit
> > got handled/scheduled within the plugin. Internally this would just free
> > the pending queue - no commit report to the Engine, no signalling of a
> > write event to the OpenSync frontend.
>
> This would work but it is a hack.

Ok - this "hack" already exists in the get_changes() context with osync_context_report_change(). (But also for other reasons, like mixed objtype syncing.)

> > An additional osync_context_ interface could later report the changed UID
> > after the write. For that the plugin would need to increase the ref of the
> > OSyncContext* and unref it once the change or error got reported with an
> > osync_context_ interface. We could actually reuse
> > osync_context_update_change() here, which is also used in get_changes().
>
> If only the pendingLimit is influenced by the former call then we don't
> need additional interfaces because the change was not written. This
> means we can still use osync_change_set_uid.

No - since the plugin process space and the engine process space are different, we still need to handle modifications of an OSyncChange within some context interface. The reason is that the OSyncChange pointer you have in the plugin is not the same pointer you have in the engine. So if you don't change or add a context interface, then after the osync_context_report_success() call there would be no way to change the UID.

For that reason we need one context call to ACK the message, to remove the message from the pending reply list, and one context interface which reports the (UID) change after the entry got written.

> > A plugin which uses the commit sink function in an async way must
> > register a committed_all() sink function - to signal when all changes
> > finally got committed.
>
> No problem, this is what the SyncML plugin does.
>
> > Would this solve the pendingLimit issue?
>
> I think so. I'm just not sure about introducing an API function to hack
> the IPC stuff.

It's not a hack - we do a similar thing inside the get_changes() context. The difference is that get_changes() gets called only once.

> Perhaps we should use a function name like
> osync_context_set_async. The problem is that the IPC stuff is already
> asynchronous and we just make it really async. Are there any other
> really asynchronous plugins?

Don't know any - maybe the qtopia-sync one. But we should introduce some example async plugin.

> I don't think that a dependency between two pipes is a good idea but I
> understand that IPC has limits. I also used IBM MQ series in the past
> which has nothing to do with IPC. So is our message queue implementation
> a real IPC implementation which requires limits?

No idea - maybe Graham can answer this.

With the new osync_context_ interface there would be no dependency - afaik. The commit function would just become async.

Best Regards,
Daniel

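The point about separate process spaces can be shown with a small model. This is a sketch under assumptions: `Change` and `send_over_ipc` are hypothetical, and JSON serialization merely stands in for whatever the real IPC layer does to move a change between engine and plugin.

```python
import json


class Change:
    """Hypothetical stand-in for an OSyncChange carrying only a UID."""

    def __init__(self, uid):
        self.uid = uid


def send_over_ipc(change):
    """Model IPC by serializing: the receiver gets a copy, not the same object."""
    wire = json.dumps({"uid": change.uid})
    return Change(json.loads(wire)["uid"])


def demo():
    engine_change = Change("local-1")
    plugin_change = send_over_ipc(engine_change)

    # Setting the UID on the plugin's copy does not reach the engine.
    plugin_change.uid = "peer-42"
    uid_seen_by_engine = engine_change.uid

    # Only an explicit report message back carries the new UID across.
    report = send_over_ipc(plugin_change)
    engine_change.uid = report.uid
    return uid_seen_by_engine, engine_change.uid
```

Here the engine still sees `local-1` after the plugin-side assignment and only learns `peer-42` once a report is sent back, which is the role the extra context interface would play.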
Re: [RFC] Remove pendingLimit from OSyncQueue

On Tuesday 14 April 2009 13:39:43 Daniel Gollub wrote:
> On Tuesday 14 April 2009 02:25:29 pm Michael Bell wrote:
> > I don't think that a dependency between two pipes is a good idea but I
> > understand that IPC has limits. I also used IBM MQ series in the past
> > which has nothing to do with IPC. So is our message queue implementation
> > a real IPC implementation which requires limits?
>
> No idea - maybe Graham can answer this.

Sorry about the delay -- I have been away and have not had a chance to spend any time on this until today.

I understand the problem, and the current timeout/limit mechanism definitely deadlocks with the way the async plugins work today (I am ignoring any changes suggested in this thread as I am not sure I understand exactly what has been proposed).

The pendingLimit is there to allow timeouts to work properly. If the pendingLimit is just removed, the timeouts will break as they did before (the timeout starts counting at the wrong time and so if there are a large number of transactions queued up the timeout fires too early).

But let's review what timeouts are for and how we would **like** them to behave.

As I understand it, the main purpose of the timeouts is to deal with cases where the remote device (or some intermediary) has got stuck and is no longer proceeding with transactions (but not returning errors). It also helps with cases where the plugin tries to send a message but does not notice that there is an error (e.g. a socket has been disconnected) and it will never get a reply. This is, of course, a plugin bug but it is useful that the timeout mechanism also protects against that problem.

There is a secondary use for timeouts and that is to protect against problems in the IPC mechanism itself -- e.g. a process has stopped and is no longer reading the pipe. This is a smaller consideration and can be handled by mechanisms within the IPC itself if necessary, so let's ignore it for now.

There seem to be three plugin architectures which are relevant (I thought, when I was rewriting the timeout code, that there were only the first two but I now realise there is a third):

1) Synchronous plugin (most plugins are like this, I believe): when a transaction is received by the plugin (e.g. Connect or Get Changes or Commit) it does synchronous writes to send messages to the device and synchronous reads to get messages from the device. If the device stops responding, the thread executing the plugin will just wait. No other plugin messages will be handled while it is waiting.

2) Asynchronous but transaction-at-a-time plugin (maybe there are none like this): when the transaction is received, the plugin sends the message to the device and then returns. The thread polls the socket and resumes when the response is received. However, other plugin messages can be handled while it is waiting -- so further updates will cause further messages to be sent to the device. If the device stops responding, the engine will keep sending updates which the plugin will send, although it is not seeing any responses.

3) Asynchronous, multiple transaction plugin (like SyncML): when the transaction is received, it is stored internally to the plugin. Nothing is sent until a message fills up or the last transaction is received. Then all updates are sent and, when responses are received, the updates are completed. If the device stops responding then all or some updates will not receive a response.

For 1 and 2, the timeout is protecting each single commit and the value should be set based on the time needed for that transaction. In the case of 2, this means the pendingLimit is needed -- it limits the number of updates that might already be queued ahead of this one and so allows the timeout value to be calculated (i.e. pendingLimit * maximum time for one update).

For 3, however, it is much harder.

One option would be to set the timeouts for each commit (and the commit_all) based on the time the device needs to complete a maximum sized message of updates. On the other hand, that doesn't allow for the fact that the OpenSync engine itself might take some time (in complex cases) to even provide enough updates to fill a message: timeouts were not intended to have to take into account OpenSync engine processing.

Another option for 3 is to set no timeouts at all on the individual commits, but set a timeout on the commit_all. The problem with that is that the timeout value for the commit_all is potentially unlimited (it is not limited to a single message of commits, as the commit_all will start as soon as the commits have all been queued and hundreds of messages may have been sent to the device).

On the other hand, the plugin itself knows what is going on. So, I think the best option for case 3 is for the plugin itself to control the timeouts. I suggest that in this case there are no timeouts on the commit or commit_all, but that the plugin itself sets a timeout when it has assembled a message and sent it to the device. I.e. add some sort of OSyncStartTimeout(int timeout) and OSyncStopTimeout() calls. The plugin would start the timeout when the first message was sent to the device. Whenever a response is received, the timeout would be stopped and, if there were one or more messages still waiting for responses, it would be started again. If the timeout actually fires, then the OSyncQueue code completes all pending operations with a timeout error (just as in the existing timeout processing).

This does mean that plugins using this third architecture have to have some extra complexity. But I can't see an alternative, if we want to keep the timeout protection.

Of course, we could decide that for this release of OpenSync we disable timeout processing altogether -- and add it back in later (with additions to the API at that time).

Does anyone have an alternative suggestion? If not, I will spec up a suggested API for the timeout operations.

Graham

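The plugin-controlled timeout suggested in this mail could behave roughly like the toy model below. It is a sketch only: the start/stop calls were merely proposed in the mail and no API had been specified, so `ToyTimeoutQueue`, `BatchingPlugin` and all method names are hypothetical, and time is passed in explicitly instead of using a real clock.

```python
class ToyTimeoutQueue:
    """Hypothetical queue side: one timer, and pending ops that it can fail."""

    def __init__(self, pending_ops):
        self.pending_ops = list(pending_ops)  # commits awaiting completion
        self.results = {}
        self.deadline = None                  # None means no timer running

    def start_timeout(self, now, timeout):
        self.deadline = now + timeout

    def stop_timeout(self):
        self.deadline = None

    def complete(self, op):
        self.pending_ops.remove(op)
        self.results[op] = "ok"

    def tick(self, now):
        """If the timer is due, complete all pending ops with a timeout error."""
        if self.deadline is None or now < self.deadline:
            return False
        for op in self.pending_ops:
            self.results[op] = "timeout"
        self.pending_ops.clear()
        self.deadline = None
        return True


class BatchingPlugin:
    """Hypothetical architecture-3 plugin that drives the timer itself."""

    def __init__(self, queue, timeout):
        self.queue, self.timeout = queue, timeout
        self.in_flight = []  # device messages awaiting a response

    def send_message(self, now, ops):
        self.in_flight.append(ops)
        if len(self.in_flight) == 1:  # first outstanding message starts the timer
            self.queue.start_timeout(now, self.timeout)

    def on_response(self, now):
        ops = self.in_flight.pop(0)
        self.queue.stop_timeout()
        for op in ops:
            self.queue.complete(op)
        if self.in_flight:            # restart while messages are still waiting
            self.queue.start_timeout(now, self.timeout)


def demo():
    queue = ToyTimeoutQueue(["c1", "c2", "c3"])
    plugin = BatchingPlugin(queue, timeout=30)
    plugin.send_message(now=0, ops=["c1", "c2"])
    plugin.send_message(now=5, ops=["c3"])
    plugin.on_response(now=20)        # answers c1 and c2, timer restarts at 20
    fired_early = queue.tick(now=40)  # 40 < 20 + 30, nothing happens
    fired_late = queue.tick(now=50)   # device never answered the second message
    return fired_early, fired_late, queue.results
```

In the model the timer measures only the time the device takes per response, independent of how long the engine needs to queue commits or how many messages a sync requires in total.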