Collection Synchronization for WebDAV

Collection Synchronization for WebDAV

by Cyrus Daboo-2


Hi folks,
FYI. Comments welcome.

A New Internet-Draft is available from the on-line Internet-Drafts
directories.


        Title : Collection Synchronization for WebDAV
        Author(s) : C. Daboo
        Filename : draft-daboo-webdav-sync-00.txt
        Pages : 14
        Date : 2007-7-3
       

   This specification defines an extension to WebDAV that allows
   efficient synchronization of the contents of a WebDAV collection.


A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-daboo-webdav-sync-00.txt


--
Cyrus Daboo



Re: Collection Synchronization for WebDAV

by Werner Donné


Hi Cyrus,

When I want to sync my client with my server I send a PROPFIND
with depth infinity for the property getlastmodified. This
requires only one round-trip. When I do this on a collection
which returns about 5000 resources the result is 870KB. That is
too large indeed. With compression on, however, it is below 20KB.
This is no longer a scalability problem.
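The request described above can be sketched roughly as follows. This is only an illustration under assumptions: the helper names `build_propfind` and `synthetic_multistatus` and the path `/coll/` are hypothetical, and whether a server actually compresses a 207 response when asked depends on the server. The second half builds a synthetic multistatus body just to show why such a response compresses well; the sizes it prints are not the figures quoted above.

```python
import gzip
import xml.etree.ElementTree as ET

DAV = "DAV:"
ET.register_namespace("D", DAV)


def build_propfind(path):
    """Return (method, path, headers, body) for a Depth: infinity PROPFIND
    that asks only for DAV:getlastmodified."""
    root = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV}}}prop")
    ET.SubElement(prop, f"{{{DAV}}}getlastmodified")
    body = ET.tostring(root, encoding="utf-8")
    headers = {
        "Depth": "infinity",
        "Content-Type": 'application/xml; charset="utf-8"',
        "Accept-Encoding": "gzip",  # ask the server for a compressed response
    }
    return "PROPFIND", path, headers, body


def synthetic_multistatus(count):
    """Build a fake 207 body with `count` members (illustration only)."""
    root = ET.Element(f"{{{DAV}}}multistatus")
    for i in range(count):
        resp = ET.SubElement(root, f"{{{DAV}}}response")
        ET.SubElement(resp, f"{{{DAV}}}href").text = f"/coll/item-{i:05d}.txt"
        propstat = ET.SubElement(resp, f"{{{DAV}}}propstat")
        prop = ET.SubElement(propstat, f"{{{DAV}}}prop")
        ET.SubElement(prop, f"{{{DAV}}}getlastmodified").text = (
            "Tue, 03 Jul 2007 12:00:00 GMT"
        )
        ET.SubElement(propstat, f"{{{DAV}}}status").text = "HTTP/1.1 200 OK"
    return ET.tostring(root, encoding="utf-8")


method, path, headers, body = build_propfind("/coll/")
raw = synthetic_multistatus(5000)
packed = gzip.compress(raw)
print(method, path, "Depth:", headers["Depth"])
print(len(raw), "bytes raw,", len(packed), "bytes gzipped")
```

A real response has less uniform hrefs and dates than this synthetic one, so the ratio seen in practice will differ.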

The implementation of this specification introduces a lot of
bookkeeping on the server side. I doubt it is worth the gain
in practice.

About the spec itself:

3. Open Issues

1. Yes. This can save a GET if the reason for the update was an
    update of only properties.

2. Yes, otherwise you would have to fetch all properties.

3. This seems dangerous. This is only useful to try to save a GET
    by mimicking the operation at the client side. This cannot be
    done accurately in all cases, because the client has no control
    over the way properties are copied, for example.

4. Only if you support the MOVE in item 3.

5. Losing the "read" privilege on a resource doesn't imply its
    presence can no longer be seen in the collection it is in.
    In some implementations it does, but then it is still not
    correct to return a deletion, because the client could later
    try to recreate the resource. This would have to fail because
    the invisible resource still exists.

6. In a versioning system deletion followed by recreation can't be
    considered as the change of an existing resource. The deleted
    one may be removed from the collection, but it might still exist.

5.1.3. DAV:sync-response XML Element

Why is this element introduced? You can make sync-token optional and
stick with "response". That expresses as much as the modified multistatus
content model.

Regards,

Werner.

--
Werner Donné  --  Re
Engelbeekstraat 8
B-3300 Tienen
tel: (+32) 486 425803 e-mail: werner.donne@...


Re: Collection Synchronization for WebDAV

by Mr. Demeanour


Cyrus Daboo wrote:

>
> Hi folks, FYI. Comments welcome.
>
> A New Internet-Draft is available from the on-line Internet-Drafts
> directories.
>
>
> Title        : Collection Synchronization for WebDAV
> Author(s)    : C. Daboo
> Filename     : draft-daboo-webdav-sync-00.txt
> Pages        : 14
> Date         : 2007-7-3
>
>
> This specification defines an extension to WebDAV that allows
> efficient synchronization of the contents of a WebDAV collection.
>
>
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-daboo-webdav-sync-00.txt
>
>

Section 1:
> However this does not scale well to large collections as the XML
> response to the PROPFIND response will grow with the collection size.
>
The implication is that the response to a <sync-collection> report
*won't* grow with the collection size. I think what you're trying to say
is that the size of the response to PROPFIND is proportional to the
number of resources in the collection, whereas the size of the response
to <sync-collection> is proportional to the number of *changed*
resources in the collection. The new report might be smaller than the
propfind, but it isn't necessarily going to scale any better, depending
on the nature of activity in the collection.


Section 4.1, para 3:

> The is specification

Typo.


Section 4.2:

> A simple implementation of such a token would be a numeric counter
> that counts each change as it occurs and relates that change to the
> specific object that changed.

It's hard to see how a 'numeric counter' can relate anything to anything
else - a counter can only count. However I can imagine implementing the
token as an index into a register of changes; the server would maintain
a changelog, and a given token would select all changes that have
occurred beyond a certain point.
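The changelog idea can be sketched as follows. This is a minimal illustration of a token used as an index into a register of changes, not something taken from the draft: the class name `ChangeLog`, its methods, and the two change kinds `"changed"` and `"deleted"` are all assumptions made for the example.

```python
class ChangeLog:
    """Server-side register of changes; a sync token is an index into it."""

    def __init__(self):
        self._entries = []  # list of (href, kind), oldest first

    def record(self, href, kind):
        """Append one change; kind is 'changed' or 'deleted'."""
        self._entries.append((href, kind))

    def current_token(self):
        # Handed out as a string so that clients can treat it as opaque.
        return str(len(self._entries))

    def changes_since(self, token):
        """Return (latest change per href since `token`, new token)."""
        start = int(token) if token else 0
        latest = {}
        for href, kind in self._entries[start:]:
            latest[href] = kind  # a later entry overrides an earlier one
        return latest, self.current_token()


log = ChangeLog()
log.record("/coll/a.txt", "changed")
token = log.current_token()
log.record("/coll/b.txt", "changed")
log.record("/coll/a.txt", "deleted")
changes, new_token = log.changes_since(token)
print(changes, new_token)
```

The size of the result depends on the number of resources touched since the token was issued, not on the size of the collection, at the cost of the server keeping the log.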


> The request body MUST be a DAV:sync-collection XML element (see
> Section 5.1, which MUST contain one DAV:sync-token XML element, and
> optionally a DAV:propstat XML element.

That should read "and optionally a DAV:prop XML element.", for
consistency with your examples and with the DTD fragments.

> A status code of '201 Created' is used to indicate resources that are
> new.

It's not obvious to me that this is necessary; a client can tell that a
resource is new if it has a <sync-response> element in the report, and
the resource doesn't exist in the client's cache. Allowing a 200 status
in such cases might simplify some server implementations.
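A minimal sketch of that client-side check, assuming the client keeps a cache keyed by href (the function name `classify` and the cache layout are hypothetical):

```python
def classify(reported_hrefs, cache):
    """Split the hrefs named in a sync report into new and modified ones,
    using only the client's local cache."""
    created = [h for h in reported_hrefs if h not in cache]
    modified = [h for h in reported_hrefs if h in cache]
    return created, modified


cache = {"/coll/a.txt": '"etag-1"'}
created, modified = classify(["/coll/a.txt", "/coll/b.txt"], cache)
print("new:", created, "modified:", modified)
```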

Open issues:

1. Distinguishing between property changes and content changes struck me
    immediately as something that was lacking. If you don't distinguish
    between them, then a client must make a minimum of two requests to
    ensure that it is up-to-date. In the case of a CalDAV server and a
    calendar-collection, the client could update either or both the
    cached properties and the content in a single transaction (using a
    <calendar-query> report).


5. Losing <read> on a resource doesn't necessarily prevent you from
    being able to see it; it just prevents you from being able to GET it
    (I think that is implementation-dependent). That a resource exists
    but cannot be read, could be signalled using a 401. If <read>
    controls the ability to see a resource at all, then I would expect
    the loss of <read> to have the same effect as the removal of the
    resource - 404.
    But that raises the question of what to do with newly-created
    resources that you can see, but can't read; I suppose that should
    result in a 401 for the new resource.
    Consider a server that implements a <read-properties> privilege. If
    the user has lost <read-properties> on some resource, and the
    properties themselves have changed, then presumably the client should
    delete properties from its cache.
    In general, it seems to me that the interaction of this spec with ACL
    is going to be difficult, because ACL grants implementors a lot of
    latitude. The various cases need to be worked out carefully.

--
Jack.


Re: Collection Synchronization for WebDAV

by Arnaud Quillaud

Some comments:

1) It is specified that "The "Depth" header MUST be ignored by the server and SHOULD NOT be sent by the client". But, unless I missed it, there is no mention of the actual depth of the report. I'm assuming it is 1 but maybe that is not what you had in mind. It would be worth making it clear.

2) There is no mention of direct subcollections of the request-uri:
- Would a newly created/deleted subcollection appear in the report ?
- What does it mean for a subcollection to be modified ?

3) redefinition of multistatus to include a sync-response and sync-token: is it really needed ?
The sync-token could be passed back and forth using an HTTP header (e.g. SyncToken: and if-SyncToken-Match:) instead of an xml element.
The sync-response could be avoided by not distinguishing created and modified resources. The client can do that job easily (?...).

Arnaud Q





Re: Collection Synchronization for WebDAV

by Julian Reschke


Arnaud Quillaud wrote:
> Some comments:
>
> 1) It is specified that "The "Depth" header MUST be ignored by the server
> and SHOULD NOT be sent by the client". But, unless I missed it, there is no
> mention of the actual depth of the report. I'm assuming it is 1 but maybe
> that is not what you had in mind. It would be worth making it clear.
> ...

Agreed. Please stay compatible with the REPORT framework. That is,

- state what the Depth is -- sounds like 1,

- require the client to send that when not the default (which is 0 for
REPORT),

- require the server to check the Depth.

That's necessary for

1) Consistency with other reports (fewer special cases), and

2) Leaving the door open for future extensions (with different depths).

Best regards, Julian


Re: Collection Synchronization for WebDAV

by Julian Reschke


Julian Reschke wrote:

>
> Arnaud Quillaud wrote:
>> Some comments:
>>
>> 1) It is specified that "The "Depth" header MUST be ignored by the server
>> and SHOULD NOT be sent by the client". But, unless I missed it, there
>> is no
>> mention of the actual depth of the report. I'm assuming it is 1 but maybe
>> that is not what you had in mind. It would be worth making it clear.
>> ...
>
> Agreed. Please stay compatible with the REPORT framework. That is,
> ...

Speaking of which, please make sure to stay compatible with RFC3253, 3.6
(<http://greenbytes.de/tech/webdav/rfc3253.html#rfc.section.3.6>):

"If a Depth request header is included, the response MUST be a 207
Multi-Status. The request MUST be applied separately to the collection
itself and to all members of the collection that satisfy the Depth
value. The DAV:prop element of a DAV:response for a given resource MUST
contain the requested report for that resource."

That essentially means that if the response format for Depth:0 is a
DAV:multistatus, the result for Depth:1 will be many multistatus bodies
embedded into a multistatus container element.
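The nesting can be sketched as follows. This is one reading of the quoted RFC3253 text, built with Python's `xml.etree.ElementTree`; the helper names `q` and `wrap_reports` are hypothetical, and the inner reports here are empty placeholders rather than real sync reports.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
ET.register_namespace("D", DAV)


def q(name):
    """Qualified ElementTree tag name in the DAV: namespace."""
    return f"{{{DAV}}}{name}"


def wrap_reports(per_resource_reports):
    """Embed each resource's Depth:0 report inside the DAV:prop of its own
    DAV:response, all within one outer DAV:multistatus."""
    outer = ET.Element(q("multistatus"))
    for href, report in per_resource_reports.items():
        resp = ET.SubElement(outer, q("response"))
        ET.SubElement(resp, q("href")).text = href
        propstat = ET.SubElement(resp, q("propstat"))
        prop = ET.SubElement(propstat, q("prop"))
        prop.append(report)  # the per-resource report body
        ET.SubElement(propstat, q("status")).text = "HTTP/1.1 200 OK"
    return outer


# Placeholder Depth:0 results that are themselves multistatus elements.
reports = {
    "/coll/": ET.Element(q("multistatus")),
    "/coll/sub/": ET.Element(q("multistatus")),
}
outer = wrap_reports(reports)
nested = outer.findall(f".//{q('prop')}/{q('multistatus')}")
print(len(nested), "multistatus bodies nested in one", outer.tag)
```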

I guess it's really time to extract the definition of the REPORT method
from RFC3253, and move it to a separate doc with lots of examples...
(will start on it soon).

Best regards, Julian


Re: Collection Synchronization for WebDAV

by Werner Baumann


I had to read the draft three times, because it is not really clear
about this. But in the end I think this REPORT is meant to always be of
Depth infinity.
This is related to the definition of the synchronization token in 4.2:
 > The server will track each change and
 > provide a synchronization "token" to the client that describes the
 > state of the server at a specific point in time.

This makes sense if clients always request a REPORT for the top
level collection (Depth infinity), or at least for the highest level
collection that the client is interested in.

If a client could do REPORT requests of any depth:
The "token" it gets will represent the *state of the whole server* at
some time, but only part of the client's cached information would
be synchronized. If the client later wants to request a
REPORT on some other part of the cached information (not included in the
first report), it cannot use this token. So the client would have to
save the token for every resource separately. This would not be bad in
itself. However, if it wants to synchronize a part of its cache in which
different resources have different synchronization tokens associated, it
is impossible to determine which is the oldest one, because the draft
insists the token has to be
 > an "opaque" string - i.e. the
 > actual string data has no specific meaning or syntax.

The client will have to do a full REPORT request for the top level
collection (Depth infinity) in this case.

I believe the draft only makes sense when REPORTs are always Depth
infinity for the top level collection. This seems to violate RFC3253 (as
I understand Julian). For many clients this will also produce far more
unnecessary traffic than might be saved by reporting only changes.

I also cannot understand why it is important, or even desirable, for a
token to have "no specific meaning".

 > 4.1.  Overview
 >  In order to synchronize data between two entities some form of
 >  synchronization token is required to define the state of the data to
 >  be synchronized at a particular point in time.  That token can then
 >  be used to determine what has changed since that time and the
 >  current time.

To refer to the state of data *at a particular point in time* in order to get
information about *what has changed since that time*, the most natural choice
for a token is that particular *point in time*. Why remove its meaning
and the order?
But this cannot be the HTTP-time. The resolution must be far better
(nanosecond should be possible), so that the server can make sure that
no more than one state change occurs within one time interval.
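A minimal sketch of such a time-based token, assuming a single-threaded server process (no locking is shown). The class name `TimestampTokens` and the injectable `clock` parameter are hypothetical; `time.time_ns` is the standard-library nanosecond clock, whose real resolution depends on the platform.

```python
import time


class TimestampTokens:
    """Issue strictly increasing nanosecond timestamps as sync tokens."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock
        self._last = 0

    def next_token(self):
        now = self._clock()
        # If the clock has not advanced (or has stepped backwards), move one
        # unit past the previous token so that no two changes share a value.
        self._last = max(now, self._last + 1)
        return self._last


# A frozen clock stands in for two changes within the same clock tick.
tokens = TimestampTokens(clock=lambda: 100)
first = tokens.next_token()
second = tokens.next_token()
print(first, second, "oldest:", min(first, second))
```

Because the tokens are ordered, a client holding several of them can pick the oldest with a plain comparison, which is not possible with an opaque string.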

I believe the problem of synchronization is strongly related to
unresolved questions in the basic WebDAV protocol (RFC 4918):
- Last modified time for properties and collections is undefined
- Etag for collections is undefined
- as a consequence: it is impossible to define the meaning of
conditional PROPFIND requests.

Instead of suggesting new REPORTs, tokens and elements, these open
questions should be resolved. I am sure this would enable conditional
PROPFIND of any depth for efficient synchronization of cached data.

Cheers
Werner



PROPFIND/REPORT on non existing resource/collection

by Arnaud Quillaud


Hello,

When issuing a PROPFIND or REPORT with a Request URI that does not exist, what should be returned:
- a 404 Not Found ?
- or a 207 multistatus containing a single response + href + status = 404 (e.g. <DAV:response><DAV:href>/toto/</DAV:href><DAV:status>404 Not Found</DAV:status></DAV:response>) ?

The only indication that I have found comes from the REPORT method definition where it is stated (http://tools.ietf.org/html/rfc3253#section-3.6): "If a Depth request header is included, the response MUST be a 207 Multi-Status.".

Thanks,

Arnaud Q





Re: PROPFIND/REPORT on non existing resource/collection

by Julian Reschke


Arnaud Quillaud wrote:
> Hello,
>
> When issuing a PROPFIND or REPORT with a Request URI that does not exist, what should be returned:
> - a 404 Not Found ?

Yes.

> - or a 207 multistatus containing a single response + href + status = 404 (e.g. <DAV:response><DAV:href>/toto/</DAV:href><DAV:status>404 Not Found</DAV:status></DAV:response>) ?

Should be equivalent.
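On the client side, treating the two forms as equivalent might look like the sketch below. It is only an illustration: the function name `request_uri_missing` is hypothetical, and the status parsing is deliberately lenient so that it accepts both a full status line ("HTTP/1.1 404 Not Found") and the shortened form used in the example above.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"


def request_uri_missing(status_code, body, request_path):
    """True if the response says the Request-URI itself does not exist,
    whether as a plain 404 or as a 207 wrapping a single 404 response."""
    if status_code == 404:
        return True
    if status_code != 207:
        return False
    root = ET.fromstring(body)
    responses = root.findall(DAV + "response")
    if len(responses) != 1:
        return False
    href = responses[0].findtext(DAV + "href", default="").strip()
    status = responses[0].findtext(DAV + "status", default="")
    # The code is the first or second token depending on whether the
    # "HTTP/1.1" prefix is present.
    return href == request_path and "404" in status.split()[:2]


body = (
    b'<D:multistatus xmlns:D="DAV:"><D:response>'
    b"<D:href>/toto/</D:href>"
    b"<D:status>HTTP/1.1 404 Not Found</D:status>"
    b"</D:response></D:multistatus>"
)
print(request_uri_missing(404, b"", "/toto/"))
print(request_uri_missing(207, body, "/toto/"))
```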

> The only indication that I have found comes from the REPORT method definition where it is stated (http://tools.ietf.org/html/rfc3253#section-3.6): "If a Depth request header is included, the response MUST be a 207 Multi-Status.".

I guess we need to clarify that when we move REPORT into a separate spec.

Best regards, Julian