Multi-server HTTP

Hi all,

At the IETF this week, Mark Handley and I submitted a floating-an-idea draft on multi-server HTTP and presented it in tsvarea.

http://www.ietf.org/id/draft-ford-http-multi-server-00.txt

Slides are at: http://www.ietf.org/proceedings/75/slides/tsvarea-0.pdf

I realise Transport Area didn't capture a large number of HTTP people - the main reason for presenting it there was that our key motivation was to improve Internet resource usage, and we have been doing other such work (notably multipath TCP) in that area. We were also very short on preparation time before the IETF - so apologies for missing many of you guys.

However, we would very much like input and guidance from the HTTP community. I am grateful to Henrik Nordstrom for suggesting we should bring it to the HTTPbis WG, even though as an extension it is not within the charter.

This is a brief summary of the proposal:

* We are aiming to achieve better usage of Internet resources by applying BitTorrent-like chunked downloading of large files from different servers.
* Upon connection to a Multi-Server HTTP server, when a client says they are Multi-server capable, in the response the server will provide a list of mirrors for that resource, a checksum for the file, and a chunk of the file with a Content-Range header.
* The client will then send more GET requests, this time with Range: headers, to the original server and to zero or more of the mirror servers, along with a verification header to ensure the checksum matches and so the resource is the same. The client will handle the scheduling of Range requests in order to make the most effective use of the least loaded servers.

We realise that the draft itself is not making the best use of existing proposals. During the presentation, Instance-Digests (RFC3230) were mentioned, which look ideal instead of X-Checksum, although we will still need an If-Digest-Match header. Content-MD5 was also suggested but that appears to be a checksum of just the data that is sent, not the whole resource.

I discounted ETags along with If-Match in the proposal since RFC2616 says "Entity tags are used for comparing two or more entities from the same requested resource" but if I have understood the terminology correctly, in our proposal we are fetching chunks from different resources (even though the content should be the same). Indeed the RFC also says, "The use of the same entity tag value in conjunction with entities obtained by requests on different URIs does not imply the equivalence of those entities." Please correct me if I'm wrong!

There is also a question of whether we could make further extensions, specifically:

* Wildcarded mirror lists (e.g. a server that mirrors all /a/*.jpg).
* Checksums could be provided for file chunks, allowing broken chunks to be re-fetched.
* Servers could store multiple versions of the file indexed by checksum.
* Initial servers could send no, or very little, data themselves, and purely act as a load balancer; or redirect immediately when overloaded.

These may change the mechanism quite considerably, however (e.g. with wildcards, no longer would you be getting all checksums from the same server; and for verification, checksum chunks need to be pre-determined and calculated).

We believe that the extension as it stands can bring significant benefit to HTTP, making much more efficient use of Internet resources. Experiments have been conducted that suggest it has no negative impact in any scenario in which it was tested.

Looking forward to your comments and advice!

Regards,
Alan

------------------------------------------------------------------------
Alan Ford

Tel: +44 (0)1794 833465
Fax: +44 (0)1794 833433
alan.ford@...

-- Roke Manor Research Ltd, Romsey, Hampshire, SO51 0ZN, United Kingdom A Siemens company Registered in England & Wales at: Siemens plc, Faraday House, Sir William Siemens Square, Frimley, Camberley, GU16 8QD. Registered No: 267550 ------------------------------------------------------------------------ Visit our website at www.roke.co.uk ------------------------------------------------------------------------ The information contained in this e-mail and any attachments is proprietary to Roke Manor Research Ltd and must not be passed to any third party without permission. This communication is for information only and shall not create or change any contractual relationship. ------------------------------------------------------------------------ Please consider the environment before printing this email |
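
The client-side step in the summary above (splitting the resource into Range requests spread over the origin and its mirrors) can be sketched roughly as follows. This is only an illustration, not the draft's algorithm: the function name `plan_ranges`, the host names, and the round-robin assignment are assumptions. The proposal leaves scheduling to the client, which would presumably favour the least loaded servers rather than rotate blindly.

```python
from itertools import cycle


def plan_ranges(total_size, chunk_size, servers):
    """Split a resource of total_size bytes into chunk_size pieces and
    assign each piece to a server in round-robin order.

    Returns a list of (server, range_header_value) pairs.
    Round-robin is a placeholder for a real load-aware scheduler.
    """
    if total_size <= 0 or chunk_size <= 0 or not servers:
        raise ValueError("need a positive size, chunk size and at least one server")
    plan = []
    rotation = cycle(servers)
    for start in range(0, total_size, chunk_size):
        # HTTP byte ranges are inclusive at both ends.
        end = min(start + chunk_size, total_size) - 1
        plan.append((next(rotation), f"bytes={start}-{end}"))
    return plan


if __name__ == "__main__":
    # Hypothetical origin and mirrors; none of these names come from the draft.
    servers = ["origin.example", "mirror1.example", "mirror2.example"]
    for server, byte_range in plan_ranges(2500, 1000, servers):
        print(f"GET from {server} with Range: {byte_range}")
```

Each pair would become one GET carrying that Range: value plus whatever verification header the extension ends up defining.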

Re: Multi-server HTTP

Alan and Mark,

There unfortunately hasn't been much discussion of this yet, at least on the list. Has there been progress elsewhere?

For my part, this looks like interesting work. If I understand it correctly, it's entirely application-layer (or at least able to be implemented within the application layer), so if you want to, I think it's entirely appropriate to discuss it on this list.

Also, have you made contact with the folks doing Metalink <http://www.metalinker.org/>? They have deployed implementations, and it's my understanding that they're looking at revising the spec now, so it may be an excellent time to collaborate.

Personally, I'd like to see the end result able to use the same URL for multi-server downloads and "traditional" single-server downloads; i.e., it should be transparent to clients.

Cheers,

On 31/07/2009, at 9:59 PM, Ford, Alan wrote:
> [...]

--
Mark Nottingham http://www.mnot.net/

RE: Multi-server HTTP

Hi Mark, all,

Thanks for your response. Unfortunately we have not received any further feedback on this, which is a shame since we'd really like to know if there is interest in trying to move this forward.

I have (admittedly only briefly) looked at metalink. It seems to cover some of what we need (list of mirrors, pieces, checksumming) but seems mostly to be concerned with finding a single appropriate source rather than downloading from multiple HTTP servers. This seems to mostly be a client rather than a spec choice, however.

Nevertheless, one of the disadvantages of metalink, from our point of view, is that it is an overhead. This is negligible for large files, but one of our (longer term) use cases is for mirrors of a whole site, allowing e.g. a set of images to be downloaded from different servers. As such, there is a moderate delay before a download would start, since first the metalink must be downloaded, then decisions made, then new downloads started.

In our case, the download starts immediately, just as in standard HTTP, and the client can take over the requesting of various parts when it is ready, so there is no delay introduced by metadata handshaking.

Our solution is indeed designed to operate on the same URLs as [...]. It seems that it is feasible for metalink to also be done transparently (by the client declaring "Accept: application/metalink+xml" as I understand it).

So, folks... any more thoughts? :)

Regards,
Alan

> -----Original Message-----
> From: Mark Nottingham [mailto:mnot@...]
> Sent: 25 August 2009 07:14
> To: Ford, Alan
> Cc: ietf-http-wg@...; Mark Handley
> Subject: Re: Multi-server HTTP
>
> [...]

Re: Multi-server HTTP

On Tue, Aug 25, 2009 at 5:24 AM, Ford, Alan <alan.ford@...> wrote:
> Hi Mark, all,
>
> I have (admittedly only briefly) looked at metalink. It seems to cover
> some of what we need (list of mirrors, pieces, checksumming) but seems
> mostly to be concerned with finding a single appropriate source rather
> than downloading from multiple HTTP servers. This seems to mostly be a
> client rather than a spec choice, however. Nevertheless, one of the

This wasn't really a spec choice, more an inadequacy in explaining what metalink offers in the abstract and introduction of our ID. :) Looking at our ID, it doesn't really spell out what we've solved in the past 4 years to those unfamiliar with metalink. Our ID is focused more on the format, not on what the client does with it.

All but a few of the 30-some metalink clients support downloading from multiple HTTP servers. That is, clients aren't required to support multi-source downloads, but most metalink clients are download managers / accelerators. I think using mirrors for fallback / failover is just as important though.

See http://en.wikipedia.org/wiki/Metalink or (our embarrassing) http://www.metalinker.org/implementation.html

Your excellent introduction put ours to shame, so I've tried to update ours:

   All the information about a download, including mirrors, checksums, digital signatures, and more can be stored in a machine-readable Metalink file. This Metalink file transfers the knowledge of the download server (and mirror database) to the client. Clients can fallback to alternate mirrors if the current one has an issue. With this knowledge, the client is enabled to work its way to a successful download even under adverse circumstances. All this is done transparently to the user and the download is much more reliable and efficient. In contrast, a traditional HTTP redirect to a mirror conveys only extremely minimal information - one link to one server, and there is no provision in the HTTP protocol to handle failures.

   Other features that some clients provide include multi-source downloads, where chunks of a file are downloaded from multiple mirrors (and optionally, Peer-to-Peer) simultaneously, which frequently results in a faster download. Metalinks also provide structured information about downloads that can be indexed by search engines.

http://tools.ietf.org/html/draft-bryan-metalink#section-1

I should note though that metalink requires no changes to a server. A user can create a metalink.

> disadvantages of metalink, from our point of view, is that it is an
> overhead. This is negligible for large files, but one of our (longer
> term) use cases is for mirrors of a whole site allowing e.g. a set of
> images to be downloaded from different servers. As such, there is a
> moderate delay before a download would start since first the metalink
> must be downloaded, then decisions made, then new downloads started.

We could add this if people want. No one had requested it.

> In our case, the download starts immediately, just as in standard HTTP,
> and the client can take over the requesting of various parts when it is
> ready, so there is no delay introduced by metadata handshaking.

Downloads with metalink start immediately as well.

> Our solution is indeed designed to operate on the same URLs as [...]. It seems
> that it is feasible for metalink to also be done transparently (by the
> client declaring "Accept: application/metalink+xml" as I understand it).

Yes, we've been experimentally using transparent content negotiation, which we have since learned is bad. :) We'll be using Mark's Link header in the future.

--
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
  )) Easier, More Reliable, Self Healing Downloads
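
The fallback / failover behaviour described in this message (try a mirror, move on to the next one if it fails) might look something like the sketch below. It is not taken from any metalink client: `fetch_with_fallback`, `fake_fetch`, and the URLs are hypothetical, and the network layer is passed in as a plain callable so the example stays self-contained.

```python
def fetch_with_fallback(urls, fetch):
    """Try each URL in order and return (url, body) from the first that works.

    `fetch` is any callable that takes a URL and returns bytes, or raises
    OSError on failure, so the network layer stays pluggable.
    """
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:
            # Remember why this mirror failed, then try the next one.
            errors.append(f"{url}: {exc}")
    raise OSError("all mirrors failed: " + "; ".join(errors))


def fake_fetch(url):
    """Stand-in for a real HTTP client: only one (hypothetical) mirror is up."""
    if url != "http://mirror2.example/file.iso":
        raise OSError("connection refused")
    return b"file contents"
```

A real client would also verify the downloaded bytes against the advertised checksum before accepting them, and treat a mismatch like any other failed mirror.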

RE: Multi-server HTTP

On Tue 2009-08-25 at 10:24 +0100, Ford, Alan wrote:
> Well that would certainly be a great solution for many use cases. If
> there was a distributed set of virtual hosts of a given server, I can
> see that working quite well. Plus, this would solve the ETag issue that
> I mention (assuming I have understood correctly - nobody has yet
> corrected me!), since in this case the client /is/ requesting the same
> resource from each IP address.

ETag is per URI, but it is entirely fine for a specification like this to require that the participating servers use the same ETag for the same object version, for example based on a hash of the object data. How servers compose ETags is outside the HTTP specification and a property of the server implementation; the only requirement HTTP places is uniqueness among versions or variants of the same URI. The base HTTP specifications place no direct requirements on how ETags from different URIs relate to each other, but do hint that for objects having multiple URIs, where those URIs are equal, it's expected the ETag would also be the same.

However, many server implementations available today do not easily allow this on the wide scale you require without additions on the server, as they base the ETag on other metadata that may differ between mirrors of the same object, such as local file timestamp, filesystem inode numbers etc., not really taking the actual content of the object into consideration.

Regards
Henrik
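
A content-derived ETag of the kind suggested here could be computed as in this minimal sketch. The choice of SHA-256 and the helper name `content_etag` are assumptions for illustration; nothing in the thread or in HTTP mandates a particular hash.

```python
import hashlib


def content_etag(data: bytes) -> str:
    """Return a strong ETag (a quoted string) derived only from the object bytes.

    Because no timestamps or inode numbers are involved, two mirrors
    holding identical bytes produce identical ETags.
    """
    return '"' + hashlib.sha256(data).hexdigest() + '"'
```

Mirrors that build the tag this way would agree on it without coordinating, which is the property the If-Match approach would rely on.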

RE: Multi-server HTTP

Hi Henrik, all,
Thanks for the clarification, so it seems we could in theory define the ETag for this specification to ensure it matches across servers.

That would remove the need for all headers except the Mirrors: header, and possibly Multiserver-Version (so that the server knows it's talking to a multiserver-capable client and thus the ETag is defined this way). If we didn't mind a small delay, we could probably do away with that too and say the client could infer capability by getting a Mirrors: header back from a HEAD request first, and then deciding what to do (assuming the connection can be kept alive).

Which brings me onto another thing about the Mirrors: header. One of our longer-term goals with this would be to somehow provide wildcarded lists of mirrors, so that a client could immediately run off and fetch bits of a website from many mirrors, potentially speeding up loading time considerably, and providing an alternative method of load balancing.

However, I'm struggling to see a neat way of doing this reliably, since we couldn't get checksums for every file on the first handshake (or if all content was static we might be able to, but it's a big overhead).

Does anybody have any ideas as to a neat way of doing this? Best I can think of so far is some sort of version number/(pseudo)hash of the entire directory structure!

Regards,
Alan

> -----Original Message-----
> From: Henrik Nordstrom [mailto:henrik@...]
> Sent: 26 August 2009 21:30
> To: Ford, Alan
> Cc: Robert Siemer; Mark Nottingham; ietf-http-wg@...; Mark Handley
> Subject: RE: Multi-server HTTP
>
> On Tue 2009-08-25 at 10:24 +0100, Ford, Alan wrote:
> > Well that would certainly be a great solution for many use cases. If
> > there was a distributed set of virtual hosts of a given server, I
> > see that working quite well. Plus, this would solve the ETag issue that
> > I mention (assuming I have understood correctly - nobody has yet
> > corrected me!), since in this case the client /is/ requesting the same
> > resource from each IP address.
>
> ETag is per URI, but it is entirely fine for a specification like this
> to require that the participating servers use the same ETag for the same
> object version, for example based on a hash of the object data. How
> servers compose the ETag is outside of the HTTP specification and a
> property of the server implementation; the only requirement HTTP places
> is uniqueness among versions or variants of the same URI. The base HTTP
> specifications place no direct requirements on how ETags from different
> URIs relate to each other, but do hint that for objects having multiple
> URIs where those URIs are equal it's expected the ETag would also be the
> same.
>
> However, many server implementations available today do not easily allow
> this on the wide scale you require without additions on the server, as
> they base the ETag on other metadata that may differ between mirrors of
> the same object, such as local file timestamp, filesystem inode numbers
> etc., not really taking the actual content of the object into
> consideration.
>
> Regards
> Henrik

--
Roke Manor Research Ltd, Romsey, Hampshire, SO51 0ZN, United Kingdom
A Siemens company
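Henrik's suggestion that participating servers derive the ETag from a hash of the object data, rather than from timestamps or inode numbers, can be illustrated with a small sketch. Nothing here comes from either draft: the helper name `content_etag` and the choice of hex-encoded SHA-1 are assumptions made purely for illustration.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Return a strong ETag derived only from the object bytes.

    Hypothetical helper: every mirror holding identical bytes produces
    the same value, unlike ETags built from mtime or inode numbers.
    """
    digest = hashlib.sha1(body).hexdigest()
    return f'"{digest}"'  # entity tags are quoted strings in HTTP


# Two mirrors with identical content agree; an out-of-date mirror does not.
master = content_etag(b"release-1.0 contents")
mirror = content_etag(b"release-1.0 contents")
stale = content_etag(b"release-0.9 contents")
```

With a scheme like this, a client could send the master's ETag in an If-Match condition to any mirror and expect a mismatch to surface as a failed precondition rather than as silently different bytes.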
|
|
RE: Multi-server HTTP
On Fri, 28 Aug 2009, Ford, Alan wrote:

> the client could infer capability by getting a Mirrors: header back from a
> HEAD request first, and then deciding what to do (assuming the connection
> can be kept alive).

That would work even if the connection isn't kept alive, wouldn't it?

> Which brings me onto another thing about Mirrors: header. One of our
> longer-term goals with this would be to somehow provide wildcarded lists of
> mirrors, so that a client could immediately run off and fetch bits of a
> website from many mirrors, potentially speeding up loading time
> considerably, and providing an alternative method of load balancing.
>
> However, I'm struggling to see a neat way of doing this reliably, since we
> couldn't get checksums for every file on the first handshake (or if all
> content was static we might be able to, but it's a big overhead). Does
> anybody have any ideas as to a neat way of doing this? Best I can think of
> so far is some sort of version number/(pseudo)hash of the entire directory
> structure!

This idea is attractive methinks, but coming up with a fine protocol for it is really tricky.

A hash of the entire directory would be problematic, I think, since it would imply that both directory structures need to remain identical - not only hold the right files and no extra files.

I'm thinking like: you have two sites A and B, and they show one picture each, A.jpg and B.jpg. Both sites refer to a mirror that holds BOTH those images in the same directory. It could work fine, but the mirror's dir doesn't look the same as the dir of either A or B. That concept would break too easily I think.

We want to avoid doing requests to non-existing resources on the mirror that would respond with a 404 (which would then have to be retried against the master site or another mirror) - we need a decent way for a client to know which URIs it can try to get from a mirror instead of the master...

I think all this makes me favour not a wildcard concept, but more a list concept where a site can list not only that "this object also exists HERE and HERE" but then also "THESE OTHER OBJECTS also exist HERE and HERE", and "THESE OTHER" would then be a list of (relative?) URIs somehow. But this becomes awkward if the list of items is long.

Then we come to the concept of changing items. How long can a client assume that the mirrors have the corresponding object? Would they need some kind of cache control headers to specify that? In the mirror-for-a-single-object case I think we can assume that the mirror will have the object for at least a very short while after the response said so, but then it too gets this problem.

--
/ daniel.haxx.se
|
|
RE: Multi-server HTTP
On Fri 2009-08-28 at 12:38 +0100, Ford, Alan wrote:

> Multiserver-Version (so that the server knows it's talking to a
> multiserver-capable client and thus the ETag is defined this way).

Not needed. It's sufficient that the server announces the support. In fact, it's strongly recommended that it always announces it, or you'll run into some hairy issues with caching.

> Which brings me onto another thing about Mirrors: header. One of our
> longer-term goals with this would be to somehow provide wildcarded lists
> of mirrors, so that a client could immediately run off and fetch bits of
> a website from many mirrors, potentially speeding up loading time
> considerably, and providing an alternative method of load balancing.

That should imho be in a profile which you reference from a header, i.e. by using the Link header referring to a mirror profile.

> However, I'm struggling to see a neat way of doing this reliably, since
> we couldn't get checksums for every file on the first handshake (or if
> all content was static we might be able to, but it's a big overhead).

Right.. so the client needs to pick one known server (perhaps "at random") as the master server for any given request, giving the needed object metadata, based on whatever prior knowledge it has about the mirror setup.

> Does anybody have any ideas as to a neat way of doing this? Best I can
> think of so far is some sort of version number/(pseudo)hash of the
> entire directory structure!

Such a hash isn't useful unless you retrieve the complete structure, which most often is not what you want to do. Imho what you can provide in the mirror profile is just the URL patterns where content may be found. Hashes etc. have to be resolved per object when fetched.

Additionally, the list of mirrors can be fairly large, making it unsuitable to be sent in HTTP headers. Consider for example a site with hundreds of mirrors, which is not unrealistic (even the little Squid project has in the range of 70 registered and verified mirrors).

So I would recommend the following slightly different approach to your problem.

* Define a new Mirror profile object, similar to MetaLink but defining the mirror URL policy for groups of URLs on the server, without going into checksums etc. (HTTP will give those).

* Instance-Digest header returning the object checksum.

* HTTP addendum that servers participating in this mirror scheme should all share the same ETag policy, i.e. base it on the file contents and not server-unique filesystem metadata.

1. First request for a mirrored URL. Plain GET request, perhaps with a Range limit (not required). The client discovers the mirror profile link in the header, and maybe a MetaLink relation as well (the two happily coexist). From this response the client learns the following metadata about the requested object, while also starting to receive the object:

   * ETag
   * Instance-Digest
   * Mirror profile link
   * Object size
   * Recovery profile link

2. If the object is large and gets delivered slower than expected, then the client fetches the mirror profile and starts a number of parallel ranged downloads (one per selected mirror server other than the first), using If-Match conditions based on the ETag to quickly detect out-of-date mirrors. If no Range limit was given in the original request, then work from the tail of the object (the first request is still running and will eventually catch up); otherwise continue after the range requested in the first request.

2b. If a server rejects the If-Match condition then something is fishy. If the metadata came from the master server, or the master server has already acknowledged the validity by accepting an If-Match condition, then ignore those other servers rejecting If-Match. If the master server has not yet been queried, then pick the master server as fallback for the first failed range. If the master server rejects the If-Match, then restart the download from the beginning using the master server for the initial range.

3. If the first request was not Range limited, then abort it by closing the connection when it catches up with the other parallel downloads of the same object.

4. On the next requested URL the mirror profile of the server is already known, and the client can pick the server that seems fastest for the initial request, where it will learn the required object-specific metadata (ETag, Size, Instance-Digest, Recovery profile link).

5. If the object checksum does not match the Instance-Digest, then fetch the recovery profile link, where partial checksums etc. can be found, allowing detection of which server returned bad information.

In this approach all servers providing the mirror service SHOULD use the same ETag and preferably also provide an Instance-Digest checksum. It's possible, however, to specify this property per server in the mirror profile; the modification for servers not sharing the same ETag is that If-Match won't be used for those servers. This slightly increases the risk of a failed transfer, requiring recovery after the download is supposed to be complete. And at least one of the selected servers needs to provide Instance-Digest to be able to detect corrupted transfers.

I.e. it's in most cases sufficient that the master server provides mirror profile and Instance-Digest information, but operation will be more robust and efficient if the mirror servers do implement a common ETag and preferably Instance-Digest as well.

In fact the emitted ETag may be implemented as the same as the instance digest for simplicity, but there is no need to specify how the ETag is generated, just that it needs to be shared among the mirror servers.

Regards
Henrik
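The parallel ranged download described in Henrik's step 2 can be sketched as a planning function. This is only an illustration under stated assumptions: the name `plan_ranges`, the even split across mirrors, and the example URLs are all hypothetical, and the sketch covers only the case where the first request was Range limited (so the mirrors continue after that range). It builds request headers and performs no network I/O.

```python
from typing import Dict, List


def plan_ranges(size: int, first_end: int, mirrors: List[str],
                etag: str) -> List[Dict[str, str]]:
    """Split the bytes after `first_end` evenly across the mirrors.

    `first_end` is the last byte index covered by the initial request.
    Each planned request carries If-Match, so an out-of-date mirror
    fails the precondition instead of sending mismatched bytes.
    """
    start = first_end + 1
    remaining = size - start
    if remaining <= 0 or not mirrors:
        return []
    chunk = -(-remaining // len(mirrors))  # ceiling division
    plan = []
    for i, mirror in enumerate(mirrors):
        lo = start + i * chunk
        if lo >= size:
            break  # fewer chunks than mirrors
        hi = min(lo + chunk, size) - 1  # Range end is inclusive
        plan.append({
            "url": mirror,
            "Range": f"bytes={lo}-{hi}",
            "If-Match": etag,
        })
    return plan


plan = plan_ranges(
    size=1000,
    first_end=99,
    mirrors=["http://m1.example/f", "http://m2.example/f",
             "http://m3.example/f"],
    etag='"abc"',
)
```

A real client would presumably weight the split by observed mirror speed rather than dividing evenly, and would reassign a range to the master server when a mirror rejects the If-Match condition, as step 2b describes.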
|
|
Re: Multi-server HTTP
On Fri, Aug 28, 2009 at 11:27 AM, Henrik Nordstrom<henrik@...> wrote:
> On Fri 2009-08-28 at 12:38 +0100, Ford, Alan wrote:
>
> * Define a new Mirror profile object, similar to MetaLink but defining
> the mirror URL policy for groups of URLs on the server, without going
> into checksums etc (HTTP will give those).
>
> * Instance-Digest header returning the object checksum

My connection has been down for a few days but here are my very rough ideas on doing Metalink in HTTP headers with the Link header, Instance Digests, and perhaps Content-MD5.

Briefly, it's:

    Link: <http://www2.example.com/example.ext>; rel="alternate";
    Link: <ftp://ftp.example.com/example.ext>; rel="alternate";
    Link: <http://example.com/example.ext.torrent>; rel="describedby"; type="torrent";
    Link: <http://example.com/example.ext.asc>; rel="describedby"; type="application/pgp-signature";
    Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

http://www.ietf.org/id/draft-bryan-metalinkhttp-00.txt

I've been meaning to ask Mark Nottingham if "alternate" from the Link header fits what we are using it for, to mean identical, duplicate copy, etc?

The Link Relation Type registry's initial contents are:

    o Relation Name: alternate
    o Description: Designates a substitute for the link's context.
    o Reference: [W3C.REC-html401-19991224]

--
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
 )) Easier, More Reliable, Self Healing Downloads
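To show how a client might pick the mirror URIs out of Link headers like the ones Anthony lists, here is a minimal sketch. It is not a full Web Linking parser: the names `parse_link`, `LINK_RE` and `PARAM_RE` are hypothetical, and the regular expressions only handle the simple `<uri>; name="value";` shape used in the message.

```python
import re
from typing import Dict

# Matches "<uri>" followed by zero or more ; name="value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_link(value: str) -> Dict[str, str]:
    """Parse one simple Link header value into href plus parameters."""
    match = LINK_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a Link header value: {value!r}")
    link = {"href": match.group(1)}
    link.update(dict(PARAM_RE.findall(match.group(2))))
    return link


headers = [
    '<http://www2.example.com/example.ext>; rel="alternate";',
    '<ftp://ftp.example.com/example.ext>; rel="alternate";',
    '<http://example.com/example.ext.torrent>; rel="describedby"; type="torrent";',
]
links = [parse_link(h) for h in headers]
mirrors = [l["href"] for l in links if l.get("rel") == "alternate"]
```

The relation name is compared as a plain string here, so the same sketch would work unchanged if the draft settled on a different relation name for mirrors.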
|
|
RE: Multi-server HTTP
On Fri 2009-08-28 at 15:45 +0200, Daniel Stenberg wrote:

> I think all this make me favour not a wildcard concept, but more a
> list-concept where a site can list not only that "this object also exist HERE
> and HERE" but then also "THESE OTHER OBJECTS also exist HERE and HERE" and
> "THESE OTHER" would then be a list of (relative?) URIs somehow. But this
> becomes awkward if the list of items is long.

I am in favor of wildcards and similar patterns. It's a one-way mapping, mapping the original URL to possible mirrors, not the other way around.

> Then we come to the concept of changing items. How long can a client assume
> that the mirrors have the corresponding object? Would they need some kind of
> cache control headers to specify that?

In the mirror-for-a-single-object case based on response headers, the mirror had better keep the object for as long as we declare the object fresh (Cache-Control: max-age etc.). HTTP does not define different freshness for headers and the rest of the object, only for a response as a whole. Adding new cache controls for these headers is pretty useless, as the rest of the HTTP infrastructure (caches/proxies) will continue to use the normal HTTP freshness definitions.

> In the mirror-for-a-single-object case
> I think we can assume that the mirror will have the object for at least a very
> short while after the response said so but then it too gets this problem.

In my experience the problem in most mirror setups is the reverse: either the mirror hasn't yet got the object or, more troublesome to deal with, the mirror has not yet updated its copy of an existing object that changed. The first is a rather trivial error condition to deal with, no worse than when a mirror site isn't reachable (actually easier). The latter is worse and will cause a bad download unless there are reasonable means to protect against it (i.e. a common ETag and/or a hash Digest).

Regards
Henrik
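The one-way mapping from an original URL to possible mirrors that Henrik favours could look roughly like the sketch below. The thread never defines a mirror profile format, so everything here is an assumption: the `PROFILE` structure (origin path prefix to mirror base URLs), the function name `mirror_candidates`, and the example hosts are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical profile: (path prefix on the origin, mirror base URLs).
# More specific prefixes are listed first so they win.
PROFILE: List[Tuple[str, List[str]]] = [
    ("/pub/releases/", ["http://m1.example.net/acme/",
                        "ftp://m2.example.org/mirror/acme/"]),
    ("/pub/", ["http://m3.example.com/pub/"]),
]


def mirror_candidates(path: str) -> List[str]:
    """Map an origin path to candidate mirror URLs (one-way only)."""
    for prefix, bases in PROFILE:
        if path.startswith(prefix):
            tail = path[len(prefix):]
            return [base + tail for base in bases]
    return []  # not covered by the profile: use the origin only


release = mirror_candidates("/pub/releases/v1/a.tar.gz")
other = mirror_candidates("/docs/index.html")
```

Because the mapping only says where content may be found, the client would still need per-object metadata (ETag, digest) from one server before trusting bytes from any candidate.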
|
|
Re: Multi-server HTTP
I don't think so; 'alternate' doesn't specify for what purpose it's an alternate, and you need a very precise definition (byte-for-byte equivalence of representations). 'alternate' is often used to mean "here's a copy in another format" and similar.

Perhaps you should mint 'duplicate'...

Cheers,

On 29/08/2009, at 4:01 AM, Anthony Bryan wrote:
> [...]
> I've been meaning to ask Mark Nottingham if "alternate" from Link
> header fits what we are using it for, to mean identical, duplicate
> copy, etc?
>
> The Link Relation Type registry's initial contents are:
>
> o Relation Name: alternate
> o Description: Designates a substitute for the link's context.
> o Reference: [W3C.REC-html401-19991224]

--
Mark Nottingham     http://www.mnot.net/
|
|
Re: Multi-server HTTP
On Mon, Aug 31, 2009 at 3:39 AM, Mark Nottingham<mnot@...> wrote:
> I don't think so; 'alternate' doesn't specify for what purpose it's an
> alternate, and you need a very precise definition (byte-for-byte equivalence
> of representations). 'alternate' is often used to mean "here's a copy in
> another format" and similar.
>
> Perhaps you should mint 'duplicate'...

Ok, this is what I have in the ID now:

Link Relation Type Registration: "duplicate"

    o Relation Name: duplicate
    o Description: Refers to an identical resource that is a byte-for-byte
      equivalence of representations.
    o Reference: This specification.

--
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
 )) Easier, More Reliable, Self Healing Downloads
|
|
Re: Multi-server HTTP
That's a good start, but it deserves a bit of discussion.

"byte-for-byte" implies that the bodies are the same, but what about things like:

* Entity headers (e.g., Content-Type)
* Available content-encodings
* Whether partial content is supported
* Whether the same set of methods is supported (e.g., if A is a duplicate of B, will POSTing something to either have the same effect as on the other?)

I think the answer is that entity headers should generally be the same, so the real question is whether we're talking about the relation describing:

a) resources with duplicate representations (i.e., a GET on any of the dups will return the same reps)
b) duplicate resources (i.e., any method will have the same effect)

If it's (b), we should consider whether the resources are in fact the same "behind the curtains" (e.g., POSTing to A has the exact same effect on the world as POSTing to B), or whether they may be in fact separate systems (i.e., A and B have the same "interface", but POSTing to A may affect a different part of the world to B).

Just food for thought...

On 01/09/2009, at 6:03 AM, Anthony Bryan wrote:
> Ok, this is what I have in the ID now:
>
> Link Relation Type Registration: "duplicate"
>
> o Relation Name: duplicate
> o Description: Refers to an identical resource that is a byte-for-byte
> equivalence of representations.
> o Reference: This specification.

--
Mark Nottingham     http://www.mnot.net/
|
|
Re: Multi-server HTTP
Mark Nottingham wrote:
> [...]
> I think the answer is that entity headers should generally be the
> same, so the real question is whether we're talking about the relation
> describing:
>
> a) resources with duplicate representations (i.e., a GET on any of the
> dups will return the same reps)
> b) duplicate resources (i.e., any method will have the same effect)
>
> If it's (b), we should consider whether the resources are in fact the
> same "behind the curtains" (e.g., POSTing to A has the exact same
> effect on the world as POSTing to B), or whether they may be in fact
> separate systems (i.e., A and B have the same "interface", but POSTing
> to A may affect a different part of the world to B).

Well, we're talking about static GETable resources with a single representation. But I agree that if you make a Link relation, you'd want it to be applicable to as many HTTP resources as possible...

Or is it possible / reasonable to say "this relation doesn't make sense for dynamic or POSTable resources and shouldn't be used for those"?
|
|
Re: Multi-server HTTP
Totally; we just need to be crisp about it.

My inclination would be that if we can be more inclusive without making it significantly more complex or risky, we should; otherwise, just do what's needed.

Cheers,

On 01/09/2009, at 1:49 PM, Nicolas Alvarez wrote:
> [...]
> Well, we're talking about static GETable resources with a single
> representation. But I agree that if you make a Link relation, you'd
> want it to be applicable to as many HTTP resources as possible... Or
> is it possible / reasonable to say "this relation doesn't make sense
> for dynamic or POSTable resources and shouldn't be used for those"?

--
Mark Nottingham     http://www.mnot.net/
|
|
Re: Multi-server HTTP
Here's what I have now. More inclusive is good but I think someone else would be better at writing it than me.

http://tools.ietf.org/html/draft-bryan-metalinkhttp

Link Relation Type Registration: "duplicate"

    o Relation Name: duplicate
    o Description: Refers to an identical resource that is a byte-for-byte
      equivalence of representations.
    o Reference: This specification.
    o Notes: This relation is for static resources. That is, an HTTP GET
      request on any duplicate will return the same representation. It
      does not make sense for dynamic or POSTable resources and should
      not be used for them.

And here's the introduction (Content-MD5 is now mentioned):

    MetaLinkHeader is an alternative to Metalink, usually represented in
    an XML-based document format [draft-bryan-metalink]. MetaLinkHeader
    attempts to provide as much functionality as the Metalink XML format
    by using existing standards such as Web Linking
    [draft-nottingham-http-link-header], Instance Digests in HTTP
    [RFC3230], and Content-MD5 [RFC1864]. MetaLinkHeader is used to list
    information about a file to be downloaded. This includes lists of
    multiple URIs (mirrors), Peer-to-Peer information, checksums, and
    digital signatures.

Here's what it looks like:

    Link: <http://www2.example.com/example.ext>; rel="duplicate";
    Link: <ftp://ftp.example.com/example.ext>; rel="duplicate";
    Link: <http://example.com/example.ext.torrent>; rel="describedby"; type="torrent";
    Link: <http://example.com/example.ext.asc>; rel="describedby"; type="application/pgp-signature";
    Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

And more description:

    Metalink servers are HTTP servers that MUST have lists of mirrors and
    use the Link header [draft-nottingham-http-link-header] to indicate
    them. They also MUST provide checksums of files via Instance Digests
    in HTTP [RFC3230]. Mirror and checksum information provided by the
    originating Metalink server is considered authoritative.

    Mirror servers are typically FTP or HTTP servers that "mirror"
    another server. That is, they provide identical copies of (at least
    some) files that are also on the mirrored server. Mirror servers MAY
    be Metalink servers. Mirror servers MUST support serving partial
    content. Mirror servers SHOULD support Instance Digests in HTTP
    [RFC3230].

    Metalink clients use the mirrors provided by a Metalink server with
    Link header [draft-nottingham-http-link-header]. Metalink clients
    MUST support HTTP and MAY support FTP, BitTorrent, or other download
    methods. Metalink clients MUST switch downloads from one mirror to
    another if the one mirror becomes unreachable. Metalink clients are
    RECOMMENDED to support multi-source, or parallel, downloads, where
    chunks of a file are downloaded from multiple mirrors simultaneously
    (and optionally, Peer-to-Peer). Metalink clients MUST support
    Instance Digests in HTTP [RFC3230] by requesting and verifying
    checksums. Metalink clients MAY make use of digital signatures if
    they are offered.

On Tue, Sep 1, 2009 at 3:08 AM, Mark Nottingham<mnot@...> wrote:
> Totally; we just need to be crisp about it.
>
> My inclination would be that if we can be more inclusive without making it
> significantly more complex or risky, we should; otherwise, just do what's
> needed.

--
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
 )) Easier, More Reliable, Self Healing Downloads
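The requirement that Metalink clients verify checksums from the Digest header can be sketched as follows. This assumes the `SHA=` form shown in the example headers, which in Instance Digests (RFC 3230) denotes a base64-encoded SHA-1 digest of the instance. The function names `sha_instance_digest` and `digest_matches` are hypothetical and not taken from the draft.

```python
import base64
import hashlib
import hmac


def sha_instance_digest(body: bytes) -> str:
    """Build a Digest header value of the form SHA=<base64 SHA-1>."""
    encoded = base64.b64encode(hashlib.sha1(body).digest())
    return "SHA=" + encoded.decode("ascii")


def digest_matches(header_value: str, body: bytes) -> bool:
    """Check a downloaded body against a 'SHA=...' Digest value."""
    # Split on the first "=" only; base64 padding may add more.
    algorithm, _, expected = header_value.strip().partition("=")
    if algorithm.upper() != "SHA":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual = sha_instance_digest(body)[len("SHA="):]
    return hmac.compare_digest(actual, expected)


header = sha_instance_digest(b"example file contents")
ok = digest_matches(header, b"example file contents")
corrupted = digest_matches(header, b"example file c0ntents")
```

A multi-source client would run this check once over the reassembled file, since the digest covers the whole instance and not the individual ranges fetched from each mirror.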
|
|
Re: Multi-server HTTP
On 08/09/2009, at 11:19 AM, Anthony Bryan wrote:

> Here's what I have now. More inclusive is good but I think someone
> else would be better at writing it than me.
>
> http://tools.ietf.org/html/draft-bryan-metalinkhttp
>
> Link Relation Type Registration: "duplicate"
>
> o Relation Name: duplicate
> o Description: Refers to an identical resource that is a
> byte-for-byte equivalence of representations.

Does this imply that each resource has exactly the same set of representations, or that when two resources share representations, those representations are duplicates?

> o Reference: This specification.
> o Notes: This relation is for static resources. That is, an HTTP
> GET request on any duplicate will return the same representation. It
> does not make sense for dynamic or POSTable resources and should not
> be used for them.
>
> [...]
>
> Here's what it looks like:
>
> Link: <http://www2.example.com/example.ext>; rel="duplicate";
> Link: <ftp://ftp.example.com/example.ext>; rel="duplicate";
> Link: <http://example.com/example.ext.torrent>; rel="describedby";
> type="torrent";

Do torrents have media types yet?

> Link: <http://example.com/example.ext.asc>; rel="describedby";
> type="application/pgp-signature";
> Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=
>
> [...]

--
Mark Nottingham     http://www.mnot.net/
|
|
Re: Multi-server HTTP
On Mon, Sep 14, 2009 at 11:06 PM, Mark Nottingham <mnot@...> wrote:
>
> On 08/09/2009, at 11:19 AM, Anthony Bryan wrote:
>
>> Here's what I have now. More inclusive is good but I think someone
>> else would be better at writing it than me.
>>
>> http://tools.ietf.org/html/draft-bryan-metalinkhttp
>>
>> Link Relation Type Registration: "duplicate"
>>
>> o Relation Name: duplicate
>> o Description: Refers to an identical resource that is a
>> byte-for-byte equivalence of representations.
>
> Does this imply that each resource has exactly the same set of
> representations, or that when two resources share representations, those
> representations are duplicates?

The latter.

Any suggestions for replacement text? Because what I have isn't cutting it.

>> Here's what it looks like:
>>
>> Link: <http://www2.example.com/example.ext>; rel="duplicate";
>> Link: <ftp://ftp.example.com/example.ext>; rel="duplicate";
>> Link: <http://example.com/example.ext.torrent>; rel="describedby";
>> type="torrent";
>
> Do torrents have media types yet?

Not as far as I know.

Which is also why in draft-bryan-metalink we have this:

    4.2.10.2. The "type" Attribute

    metalink:metaurl elements MUST have a "type" attribute that indicates
    the MIME type of the metadata available at the IRI. In the case of
    BitTorrent as specified in [BITTORRENT], the value "torrent" is
    required. Types without "/" are reserved. Currently, "torrent" is
    the only reserved value.

--
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
 )) Easier, More Reliable, Self Healing Downloads
|
|
Re: Multi-server HTTP
On 15/09/2009, at 1:59 PM, Anthony Bryan wrote:

> On Mon, Sep 14, 2009 at 11:06 PM, Mark Nottingham <mnot@...> wrote:
>>
>> On 08/09/2009, at 11:19 AM, Anthony Bryan wrote:
>>
>>> Link Relation Type Registration: "duplicate"
>>>
>>> o Relation Name: duplicate
>>> o Description: Refers to an identical resource that is a
>>> byte-for-byte equivalence of representations.
>>
>> Does this imply that each resource has exactly the same set of
>> representations, or that when two resources share representations,
>> those representations are duplicates?
>
> The latter.
>
> Any suggestions for replacement text? Because what I have isn't
> cutting it.

Hm.

    Refers to a resource whose available representations are byte-for-byte
    identical with the corresponding representations of the context IRI.

>>> Here's what it looks like:
>>>
>>> Link: <http://www2.example.com/example.ext>; rel="duplicate";
>>> Link: <ftp://ftp.example.com/example.ext>; rel="duplicate";
>>> Link: <http://example.com/example.ext.torrent>; rel="describedby";
>>> type="torrent";
>>
>> Do torrents have media types yet?
>
> Not as far as I know.
>
> Which is also why in draft-bryan-metalink we have this:
>
> 4.2.10.2. The "type" Attribute
>
> metalink:metaurl elements MUST have a "type" attribute that indicates
> the MIME type of the metadata available at the IRI. In the case of
> BitTorrent as specified in [BITTORRENT], the value "torrent" is
> required. Types without "/" are reserved. Currently, "torrent" is
> the only reserved value.

Overloading type like that is bad; register a media type (or get the appropriate people to do it).

Cheers,

--
Mark Nottingham     http://www.mnot.net/