Re: Request for feedback: HTTP-based Resource Descriptor Discovery



by Eran Hammer-Lahav


Thanks for the feedback. It is extremely useful. Please note that I have already published a -01 revision last week which addressed some of these concerns.

See my comments below.

On 1/29/09 6:56 AM, "Jonathan Rees" <jar@...> wrote:
> - Please do not say 'resource discovery' as this protocol is not about
>    discovering resources.  You have many alternatives that do not say
>    something that's confusing: 'descriptor resource discovery',
>    'description discovery', 'resource description discovery', etc.

This was already changed in -01 to 'descriptor discovery'.

> - I really wish we could say something stronger about the format of
>    the DR.  May I suggest that the DR be required to possess at least
>    one 'representation' that is either RDF/XML or convertible to
>    RDF/XML using GRDDL?

It is the job of the descriptor to be useful, not the discovery spec to suggest that...

> - I anticipate some confusion as to whether the link relates the
>    resource to the DR (as in the POWDER 'describedby' definition you
>    quote), the URI to the DR, or the URI to the DR's URI (as in the
>    second sentence of section 6).  In RDF, <resource> describedby <dr>
>    is most natural to write, but RDF semantics rules out the
>    possibility that this might say anything specific to a particular
>    URI naming the resource[*].  This protocol is an opportunity for the
>    URI owner to say things not only about the resource but about the
>    URI/resource binding itself, such as its authority, provenance, and
>    stability, and that will vary with URI, not resource, as each URI
>    might have a different "owner".

Much of this debate depends heavily on two questions:

- Are we discovering a URI Descriptor or Resource Descriptor?
- Is this protocol part of the network layer or the application layer?

I don't have full answers but I am attempting as much as possible to create a Resource Descriptor discovery protocol, and I find positioning it closer to the application layer much easier to implement (since it can work on a very narrow set of network layer features).

The relationship between the URI used and the resource being discovered can be simply described as 'what we've got'. I am not sure how to say anything more useful in the spec.

> - The POWDER documentation gives a different URI for the describedby
>    relation than the one that you'd get by using the proposed
>    IANA-based relation registry.  It would be unfortunate if there
>    continued to be two URIs for the same thing, and you should work
>    with POWDER to settle on one or the other.  I would not make use
>    of the link relation registry a requirement.

'rel' types across all methods will depend directly on the proposed registry defined by draft-nottingham-http-link-header. POWDER (per Phil Archer) will be properly registered within this proposed IANA registry. Whatever draft-nottingham-http-link-header considers equivalent to the short name 'describedby' is acceptable for this.

> - Editorial comment: On first reading I found the first set of bullets
>    in section 7 to be very mysterious.  They make no sense at all until
>    you've read the following text.  I suggest that (a) you list the
>    three methods before launching into the factors that go into
>    deciding between them; and (b) that the four bullets be more
>    specific - e.g. instead of saying it depends on document type (media
>    type), say that it depends on whether the resource has a
>    representation supporting the <link> element, and rather than saying
>    it depends on URI scheme, say that it depends on whether the scheme
>    is http(s) or something else.

Yep. I'm looking for ways to move parts of this to the introduction and turn others into actionable items.

> - Bullet "HTTP Link header": "Limited to resources with an accessible
>    representation using the HTTP protocol [RFC2616], or..." -- while
>    you're not saying anything wrong here, I don't see what purpose the
>    part before the "or" serves, and I find it distracting.  I think you
>    should simply say:
>        "Limited to resources for
>        which an HTTP GET or HEAD request returns a non-5xx
>        HTTP response [RFC2616]."

This sounds reasonable.

>    The exact limitation you want to put on HTTP (2xx, 2xx+3xx,
>    2xx+3xx+4xx, or any) is debatable.  I think 3xx responses have to be
>    OK (see below), 4xx responses should be, and 5xx responses could be
>    although I don't think I would trust them.
>
>    If all HTTP responses can carry believable Link: headers, matters
>    are greatly simplified because you can just say that you can always
>    try the HTTP method - it is not limited in any way.

The difficulty is to align the spec with existing expectations and to make sure that it is always predictable. I am also trying to align the result codes with the common semantic expectation of what constitutes a valid representation for the resource identified with the URI being dereferenced.

Before I dive into a deep review of all the possible HTTP response codes, I'd like to ask a simple question. What actual use cases break and what inefficiencies are created from a strict limitation of the allowed response codes (200, 301, 302, 303, 307, 401)?

Discovery has to be non-intrusive (at least in its general purpose elements), which seems to limit us to only GET and HEAD. There is nothing stopping an application from using a normative reference to this spec and then extending the allowed set of methods and response codes if it adds value to their use cases, but I can't come up with scenarios where this restriction actually breaks stuff (that a follow-up HEAD can't solve).

With regard to the permitted HTTP response codes, I am having a hard time simply allowing whole sets (2xx, 3xx) because each one has codes that are unacceptable for this purpose.

1xx and 5xx are obviously out.

In the 2xx range:

* 200 OK - obviously useful.
* 201 Created - doesn't fit with the passive nature of the protocol (or GET/HEAD).
* 202 Accepted - implies something other than synchronous information retrieval. Not sure how a generic discovery library can handle this, and what it means in a reply to a GET/HEAD with Link headers present.
* 203 Non-Authoritative Information - I can see this being used, but should the spec call out the potential issues with trusting such information?
* 204 No Content - seems useful as it provides an updated metainformation view, but presents the issue of incomplete information.
* 205 Reset Content - no idea.
* 206 Partial Content - useful.

Given the above concerns, is it still appropriate for the spec to simply state that a 2xx response is valid? It is after all the responsibility of the application to implement HTTP correctly, which means it should be aware that each 2xx response has its own semantics. I'm ok with replacing all 200 in the spec with 2xx.

The 3xx range is harder to generalize because of existing expectations as to their semantic meaning. The problem, of course, is caused by the way this entire discovery protocol is defined. If this were a URI Descriptor Discovery protocol, a 3xx response would not be followed for the purpose of obtaining a descriptor. Instead, the Link headers on the 3xx response would be used and the Location header ignored.

Since this is trying to be a Resource Descriptor Discovery, where the resource URI is simply the first cookie crumb, the effort to obtain the Link headers must follow the same rules as the effort to obtain a valid representation of the resource (which does not stop at the first 3xx response).

What I know is that we can't have both. It should not be a matter of opinion as to which Link header ends up being found or used for the purpose of descriptor discovery. I am afraid of trying to define this in a generic way because there is too much confusion already with regard to what exactly applications should do with each 3xx code.

For each 3xx code, this is how I believe the discovery of Link headers should be performed:

* 300 Multiple Choices - Link headers on the 300 response must be ignored. How to pick the desired representation is out of scope, but one has to be selected and retrieved (rinse and repeat until a 2xx code is received) and its Links used.
* 301 Moved Permanently - Repeat the process using the URI found in the Location header. Link headers on the 301 response must be ignored.
* 302 Found - same as 301.
* 303 See Other - the 303 Link headers are used and the URI found in the Location header is not used for discovery since the Location header points to a different resource.
* 304 Not Modified - does not seem to contain any relevant information, and I'm not sure what to do with any Link headers it may contain.
* 305 Use Proxy - same as 301 but following proxy rules.
* 307 Temporary Redirect - same as 301.
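
As an illustration only (not text from the draft), the per-code rules listed above could be sketched in Python roughly as follows. The `Response` shape, the `fetch` callback, the hop limit, and the example.com URIs are all assumptions made for the sketch. 300 and 305 are not modelled, because representation selection and proxy rules are outside what the list specifies.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Response:
    """Assumed minimal view of an HTTP response to GET/HEAD."""
    status: int
    links: List[str] = field(default_factory=list)  # raw Link header values
    location: Optional[str] = None                   # Location header, if any


def discover_links(uri: str, fetch: Callable[[str], Response],
                   max_hops: int = 5) -> List[str]:
    """Return the Link header values to use for descriptor discovery."""
    for _ in range(max_hops):
        resp = fetch(uri)
        if 200 <= resp.status < 300:
            return resp.links      # a representation was obtained: use its Links
        if resp.status == 303:
            return resp.links      # Location names a different resource: stop here
        if resp.status in (301, 302, 307) and resp.location:
            uri = resp.location    # Links on the redirect itself are ignored
            continue
        return []                  # 300, 304, 305 and the rest: not modelled
    return []                      # hop limit reached (limit is an assumption)


# Canned responses standing in for a real HTTP client (hypothetical URIs).
responses = {
    "http://example.com/old": Response(
        301, ['<http://example.com/old-dr>; rel="describedby"'],
        "http://example.com/new"),
    "http://example.com/new": Response(
        200, ['<http://example.com/new-dr>; rel="describedby"']),
    "http://example.com/thing": Response(
        303, ['<http://example.com/thing-dr>; rel="describedby"'],
        "http://example.com/about-thing"),
}

print(discover_links("http://example.com/old", responses.__getitem__))
print(discover_links("http://example.com/thing", responses.__getitem__))
```

With these canned responses the 301 case yields the Link found on the final 200, while the 303 case yields the Link found on the 303 itself, which is exactly the asymmetry discussed in the next paragraph.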

Even if we agree that 304 is not applicable to discovery, we still have a conflicting resolution between 303 and the rest of the response codes. I am open to expanding the allowed range, but will still need to be explicit about the difference between 303 and the rest.

The 4xx range is easier to deal with because for the most part, from a discovery pov, it is not about the resource but about the request. It represents a hurdle for the client to resolve in order to move past it and obtain a representation.

Without considering any real-world use cases, it is easy to simply dismiss all 4xx responses and declare that the Link header method has failed (other methods should be attempted such as <LINK> element or Site-meta). But at least one response code can greatly benefit from this discovery protocol: 401. In the context of a 401, Link headers can offer valuable information about how to get past it. Some people seem to suggest that a 404 can be used in a similar semantic fashion as a 303, but I'd rather stay out of that debate.

My assumption is that a 4xx response is not a valid representation of the resource and therefore cannot include Link headers relevant for finding the location of the resource descriptor. It is however, a valid representation of the resource under very specific conditions, such as its access restrictions.

Even for the 401 use case, it is trivial to move discovery needs to the WWW-Authenticate response header. If the descriptor is directly related to getting past the 401 road block, it is probably more appropriate to let the security challenge define its own discovery mechanism rather than try and generalize it here.

I'm inclined after writing this to remove all 4xx codes from the supported set, including 401. The rule I am following is: no representation, no descriptor.

---

Proposed resolution: allow 2xx, 3xx with different handling of 303 vs all others, leave 4xx undefined, and forbid 1xx and 5xx. Allowing the entire 2xx range will put the burden on the client to follow basic HTTP rules (and know what is not reasonable to expect in a reply to a GET/HEAD request).
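
Read as a policy table, the proposed resolution might look like the following Python sketch. The `Handling` names are invented for illustration, and treating every 3xx other than 303 as "follow" is a simplification of the per-code list earlier in this message (300 and 304 need their own handling).

```python
from enum import Enum


class Handling(Enum):
    USE_LINKS = "use the Link headers on this response"
    FOLLOW = "ignore Link headers here and follow the redirect"
    UNDEFINED = "left undefined by the spec"
    FORBIDDEN = "not valid for descriptor discovery"


def classify(status: int) -> Handling:
    """Map an HTTP status code to its handling under the proposed resolution."""
    if 200 <= status < 300:
        return Handling.USE_LINKS      # the whole 2xx range is allowed
    if status == 303:
        return Handling.USE_LINKS      # 303 is the exception within 3xx
    if 300 <= status < 400:
        return Handling.FOLLOW         # simplification: all other 3xx
    if 400 <= status < 500:
        return Handling.UNDEFINED      # 4xx is deliberately left open
    return Handling.FORBIDDEN          # 1xx, 5xx, and anything out of range


for code in (200, 301, 303, 401, 503):
    print(code, classify(code).name)
```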

> - In TAG discussion the question arose as to why all three methods had
>    to produce the same descriptor resource location.

The language in -02 will be: "If more than one method is supported, all methods MUST produce the same set of resource descriptors." I have taken the more liberal approach.

> - Anywhere you mention 301 and 302 you should also add 307.

Yes. I will also make it clear that redirects should be obeyed when retrieving the HTML or ATOM representation in the <LINK> element method.

> - The algorithm in 8.2 is one I strongly object to, as it does not permit
>    Link: on 30x responses, which IMO is a central Semantic Web use case.
>    Consider, for example, a "value added" URI for a document where a
>    301 response provides a Link: to useful metadata, and redirects to
>    the actual document.

See previous discussion. As you clearly demonstrated, it is hard to make generic statements about whole classes of responses (i.e. 3xx, 4xx). You also raise the question of what the descriptor is about: the resource or the URI. My issue with your approach is that it isn't really an interop spec, but a best practice guide. All I care about is interop, even at the cost of eliminating potentially useful use cases. Note that your handling of 301 above is self-contradictory.

> - Your proposal to specify URI-to-DR-URI rewrites as
>    template="prefix{uri}suffix" is a good start, but I think that the
>    additional ability to specify match conditions on the input URI will
>    end up being important.  In one project I work on we're already
>    using the rule

Please review the current text [1] and let me know if it addresses all your use cases. I am well aware that it is incorrect in handling mailto URIs since they do not have an authority component (a mistake corrected in -02).

> - We need to be careful about quoting.  If a DR is meant to be found
>    via a CGI script invoked via a query URI (the link-template prefix
>    has a ? in it), and the original URI already contains significant
>    CGI characters like &, then an application could get into big
>    trouble.  This needs to be either handled directly somehow (I can't
>    imagine how), or left as a combination of a big scary disclaimer and
>    a security warning.

Can you provide examples?
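
One possible illustration of the quoting concern, as a Python sketch: the descriptor service `dr.example.org`, its `uri` query parameter, and the resource URI are all hypothetical, and percent-encoding the whole URI is shown as one conceivable mitigation rather than anything the draft specifies.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical link-template prefix that ends in a query parameter.
template_prefix = "http://dr.example.org/describe?uri="
# Resource URI that already carries its own query string with '&'.
resource_uri = "http://example.com/search?q=cats&page=2"

# Naive substitution: the resource's '&' splits the descriptor query in two.
naive = template_prefix + resource_uri
# Percent-encoding every reserved character keeps the URI in one parameter.
escaped = template_prefix + quote(resource_uri, safe="")

naive_params = parse_qs(urlsplit(naive).query)
escaped_params = parse_qs(urlsplit(escaped).query)

print(naive_params)    # 'uri' is truncated and a stray 'page' parameter appears
print(escaped_params)  # 'uri' round-trips to the original resource URI
```

In the naive case the descriptor service sees a truncated `uri` value plus an unrelated `page` parameter, which is the kind of trouble the comment above warns about.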

> - I think you need to warn that this protocol should only be applied
>    to URIs not containing a fragment id.  If you allow fragment ids
>    you're going to get into serious problems with both quoting and
>    semantics.

I am not sure what to do here. Should the fragment be removed from the definition of 'uri' in the template vocabulary? That seems like the easiest solution (allowing it to be used explicitly with the 'fragment' variable).

> [*] Footnote (not relevant unless you care about how RDF might
> interact with this discovery protocol): Suppose U1 and U2 both name
> (denote, identify, refer to, are interpreted to be, etc.) some
> resource R

Where is that established (that both refer to the same resource R)?

> and suppose that
>
>     <U1> describedby <DR1>.
>     <U2> describedby <DR2>.
>
> Then necessarily
>
>     <U1> describedby <DR2>.
>     <U2> describedby <DR1>.

Not without some other external information. We just had a couple hours of debate on a similar topic at the XRI TC, namely if multiple resources (via their representations) can point to the same descriptor URI, and if doing so implies any kind of relationship between them. We decided that it is allowed, but it does not imply any relationship between the resources pointing to the same descriptor URI.

In other words:

R1-URI --> RD-URI
R2-URI --> RD-URI

Means exactly the same as:

R1-URI --> RD1-URI
R2-URI --> RD2-URI

When the content of RD1 and RD2 is identical.

> - Under <link> element (section 7), please include XHTML along with
> HTML (this came up on a TAG telecon).

Ok.

> - I understand that we desire to stay away from a rigorous treatment
> of authentication, authority, and authorization, leaving that up
> either to risk acceptance or an orthogonal security infrastructure.
> However, we need to specify what the protocol's position is on
> attribution, in the situation where communication *is* secure and/or
> risks are accepted.

Why? Isn't this the role of the descriptor? This might be true for Link headers in general, but as used in this protocol, the only statement allowed is where to find 'information about'. No other conclusion is defined from the presence of descriptor location.

> <link> has problems in this regard that Link: and site-meta don't.
> Although in the normal case a document speaks for the owner of the URI
> that names it, there are important cases where this doesn't hold. One
> is where the resource is obsolete, so that what it said before is no
> longer true. This is not just a mistake to be fixed as faithfully
> retaining unmodified old versions is often very important.

Not really. While atomic operations are not really practical with regard to obtaining a snapshot in time of potentially all three methods for a single resource, in theory, if one is to archive a representation in which the <LINK> element is to be useful, it must also archive the other potential method outputs. How outdated data is used is out of scope.

>  From a communication point of view, <link> is the best of the three
> methods to link to a DR since there is the least risk that it will get
> detached from the representation.

From an implementation pov, <LINK> in non-XML documents is the least desirable method since parsing HTML is notoriously awful. It was suggested that the spec require at least one method other than HTML <LINK>. I am seriously considering it (but doing so will violate the principles declared in the analysis appendix).

> But your memo does talk about authority (here I
> think we mean what statements can be put in the mouths of what
> principals) as if it's a question it cares about. I think the problem
> of whether <link> speaks for the URI owner ought to be addressed
> somehow.

In -02 I am doing my best to remove any mention of 'authority' other than in relation to 3986.

EHL

[1] http://tools.ietf.org/html/draft-hammer-discovery-01#section-8.3.2.1




Re: Request for feedback: HTTP-based Resource Descriptor Discovery

by Jonathan Rees-3


Let's work out this redirection case, since nothing else matters if we  
can't agree on this. I'll get back to your other questions later.

The problem with your treatment of redirects is that the protocol can  
give the wrong answer.

The situation is that we do a GET/HEAD of a URI U, and receive a  
301/302/307 specifying Location: V. Your protocol is supposed to get a  
description resource for the resource "identified" (RFC 3986) by U,  
yet you will throw away a DR in the response to GET/HEAD U (one that  
is explicitly said to be a DR of U) and look for one in the response  
to GET/HEAD V instead. What makes you think that V names the same  
resource as U? If it doesn't, V's DR has no bearing on the resource  
named by U. Even if you assume they do name the same resource (which  
you can't in the 307 case), why would you have any reason to prefer  
the V DR to the U DR? The ability to serve a resource's  
representations does not necessarily make you better qualified than  
anyone else to describe it.

You may want to say: Well, the U and V resources have the same  
representations (GET behavior), so doesn't that mean they're the same  
resource?  I don't think it follows. In particular there are other  
methods to consider, such as POST. As far as I know all GETs can be  
the same and the resources can still be different.

The only theory I know of for deciding which resource is supposed to  
be named by a URI is that articulated in the W3C web architecture  
recommendation [1]. This says that it is up to a party known as the  
URI's "owner" to bind the URI to some resource. So if you want to  
learn about a named resource, it is up to the URI owner to determine  
what resource it is you want to learn about. Why should you talk to  
anyone else, if the owner is willing to speak (via Link:)?

I think it is practical and reasonable that *if* U's owner provides no  
DR, then we can risk taking a 301 redirect (and maybe 302) to mean  
that V names the same resource, so that V's DR, if any, describes that  
resource. But an explicit Link: on a redirect has to mean that the URI  
owner, who is an "authority", is trying to say something important to  
you about the resource, such as the ways in which it differs from the  
redirect target.
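
For comparison, the "owner speaks first" rule described in the paragraph above could be sketched in Python as follows. This is a reading of the paragraph, not an agreed algorithm: the `Response` shape, the `fetch` callback, the hop limit, and the choice to follow only 301 by default (302 being a "maybe") are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Response:
    """Assumed minimal view of an HTTP response to GET/HEAD."""
    status: int
    links: List[str] = field(default_factory=list)  # raw Link header values
    location: Optional[str] = None                   # Location header, if any


def discover_links_owner_first(uri: str, fetch: Callable[[str], Response],
                               follow: Tuple[int, ...] = (301,),
                               max_hops: int = 5) -> List[str]:
    """Prefer Links supplied on the response for U; fall back to V only if none."""
    for _ in range(max_hops):
        resp = fetch(uri)
        if resp.links:
            return resp.links          # the URI owner said something: use it
        if resp.status in follow and resp.location:
            uri = resp.location        # owner was silent: risk the redirect
            continue
        return []
    return []


# Canned responses standing in for a real HTTP client (hypothetical URIs).
responses = {
    "http://meta.example.org/doc": Response(
        301, ['<http://meta.example.org/doc-dr>; rel="describedby"'],
        "http://publisher.example.com/doc"),
    "http://alias.example.org/doc": Response(
        301, [], "http://publisher.example.com/doc"),
    "http://publisher.example.com/doc": Response(
        200, ['<http://publisher.example.com/doc-dr>; rel="describedby"']),
}

print(discover_links_owner_first("http://meta.example.org/doc", responses.__getitem__))
print(discover_links_owner_first("http://alias.example.org/doc", responses.__getitem__))
```

Passing `follow=(301, 302)` would extend the fallback to 302 responses as well.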

Even if U and V are assumed to name the same resource, or resources  
that cannot be distinguished, it is very easy to come up with cases  
where either DR is vastly preferable to the other; differences in  
credibility due to deception, reliability, competence, and timeliness  
can go either way. If you ask a librarian, they will say that the  
original publisher (V) is rarely to be trusted to provide good  
metadata, and one should consult a competent metadata service to  
obtain such (U). (This is a real use case.)

There is a practical reason to prefer the U DR: it can be obtained in  
one roundtrip, while getting the V DR takes two.

I also wonder how the redirect case is any different from that of a  
proxy server that adds a Link: header. If you could detect that the  
proxy server added it, and not the origin (you can't), would you throw  
away the proxy server specified DR, even when the origin provided none?

-Jonathan

[1] http://www.w3.org/TR/webarch/#uri-assignment



RE: Request for feedback: HTTP-based Resource Descriptor Discovery

by Eran Hammer-Lahav


What we want is a resource descriptor, not URI descriptor. It is clear that a URI descriptor discovery must not allow any secondary requests. Whatever you find after a single GET/HEAD of the dereferenced URI is what you are going to use.

The answer seems to be that the descriptor location is obtained from whatever the client considers a valid representation of the resource. From recent discussions, there seems to be consensus that the Link header is between two resources (not representations). Link headers (due to the nature of HTTP) are attached to a representation, but their subject is the resource itself. <LINK> elements have similar semantics.

Therefore, the discovery spec, instead of providing a single workflow (i.e. follow redirects, look for 200 or 303, etc.) needs to pass the decision of which Link headers to use to the client. This can be even more complex if a 301 header includes Links and the 200 header (followed from the 301) does not, but offers an HTML representation with <LINK> elements.

If you consider your example below, which Link header to use (the one attached to the 301 response or the 200 obtained by following the 301 redirect), the answer is the Link header attached to the representation of the resource the client is interested in. It is perfectly valid for different representations to include different Links (as long as the Links are not representation specific, just more applicable).

For example, descriptor discovery of web pages intended for consumption using a browser will usually ignore Link headers on the 301 and fetch those on the 200. Why? Because that is the resource they are actually interested in. They always follow redirects blindly, and the intermediate URIs are ignored and hidden from the end user.

In other words, if you have a URI U which redirects you to URI V, the decision which DR to use (U's DR or V's DR) is completely tied to which representation is more relevant to your inquiry.

This was somewhat hidden in the spec with regard to <LINK> element because it ignores how the client got from the resource URI to the HTML document. But it suffers from the same ambiguity.

The problem, of course, is finding a way to define it in an interoperable way.

EHL



> -----Original Message-----
> From: Jonathan Rees [mailto:jar@...]
> Sent: Saturday, January 31, 2009 8:55 PM
> To: Eran Hammer-Lahav
> Cc: www-tag@...; Phil Archer; Mark Nottingham; www-talk@...
> Subject: Re: Request for feedback: HTTP-based Resource Descriptor
> Discovery



Re: Request for feedback: HTTP-based Resource Descriptor Discovery

by Phil Archer-3


Eran, Jonathan,

I've been catching up with this thread this morning. The discussion
around redirects shows that it's not a simple matter. In POWDER we
basically chickened out and said that how redirects are handled is
application-specific. We get away with this because what a POWDER DR
describes is defined within the POWDER doc, and who published it is
always explicit. Therefore whether you follow a Link: on a 301 response
is irrelevant, the data you eventually get, however you get it, is
self-contained.

However, the particular line that caught my eye in this thread was:

[..]

[JR]
>> - I think you need to warn that this protocol should only be applied
>>    to URIs not containing a fragment id.  If you allow fragment ids
>>    you're going to get into serious problems with both quoting and
>>    semantics.
[EHL]
> I am not sure what to do here. Should the fragment be removed from the definition of 'uri' in the template vocabulary? That seems like the easiest solution (allowing it to be used explicitly with the 'fragment' variable).
>
>  
OK this would be a big problem for POWDER where we make specific use of
fragment IDs. The basis of POWDER is that you apply descriptors to
things defined by string or reg ex matches against a URI (everything on
example.com, all its subdomains etc.). But content management systems
don't always arrange things nicely so that the pattern matching can
work. Our big use case (and WG member) Deutsche Telekom (t-online.de)
being a case in point (at least for the time being they use numeric URIs
with no discernible pattern).

In those situations we need to link from a resource directly to its DR,
which we do using fragment IDs. So you create a POWDER file,
complete with attribution and a restriction on its applicability to,
say, t-online.de, and then put an xml:id on an actual descriptor set. A
resource can then link to that descriptor set with

<link rel="describedby" href="/powder.xml#red"
type="application/powder+xml" />

(or its HTTP equivalent). Note the #red frag. It's not as powerful as the
primary POWDER method of resource grouping but it is something we need
to support so please, please don't say that a describedby link can't
have a fragment ID!
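
A rough Python sketch of how a client might pick up such a link and keep the fragment is given here for illustration. The page URI under t-online.de is made up, and a real POWDER processor would do considerably more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class DescribedByFinder(HTMLParser):
    """Collect absolute targets of <link rel="describedby"> elements."""

    def __init__(self, base_uri: str) -> None:
        super().__init__()
        self.base_uri = base_uri
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are routed here as well.
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        if tag == "link" and "describedby" in rels and a.get("href"):
            # urljoin resolves the relative href and preserves the fragment.
            self.targets.append(urljoin(self.base_uri, a["href"]))


page = ('<html><head><link rel="describedby" href="/powder.xml#red"\n'
        'type="application/powder+xml" /></head><body></body></html>')

finder = DescribedByFinder("http://www.t-online.de/page/12345")  # hypothetical page
finder.feed(page)

document_uri, descriptor_id = urldefrag(finder.targets[0])
print(document_uri)   # the POWDER file to fetch
print(descriptor_id)  # the xml:id of the descriptor set within that file
```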

Chapter and verse on this is at [1]

Phil.

[1]
http://www.w3.org/2007/powder/Group/powder-dr/20090120-diff.html#directDescript 



RE: Request for feedback: HTTP-based Resource Descriptor Discovery

by Eran Hammer-Lahav


My understanding of the point Jonathan made is that it was specifically targeting the template vocabulary. For example, consider the following resource URI and template:

Resource: http://example.com/resource/1#body
Template: {uri};about

The question is, does {uri} include the #body fragment or not. If it does, the template produces:

http://example.com/resource/1#body;about

which is wrong as it places the suffix in the fragment. My solution in the upcoming draft (-02) is to exclude the fragment and '#' from the {uri} variable. So the above example will produce:

http://example.com/resource/1;about

and if you want to retain the fragment:

{uri};about#{fragment}

The other option is to define two sets of {uri} variables, but that seems too messy and is not called for in most use cases.
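
The behaviour described above for -02 can be sketched in a few lines of Python. The `expand` helper is hypothetical, supports only the {uri} and {fragment} variables, and applies no percent-encoding, so it is an illustration of the fragment handling only and not of the full template vocabulary.

```python
from urllib.parse import urldefrag


def expand(template: str, resource_uri: str) -> str:
    """Expand {uri} (without fragment or '#') and {fragment} in a template."""
    base, fragment = urldefrag(resource_uri)
    return template.replace("{uri}", base).replace("{fragment}", fragment)


resource = "http://example.com/resource/1#body"

print(expand("{uri};about", resource))
# http://example.com/resource/1;about
print(expand("{uri};about#{fragment}", resource))
# http://example.com/resource/1;about#body
```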

But none of this implies anything with regards to using fragments in links...

EHL


> -----Original Message-----
> From: Phil Archer [mailto:phil@...]
> Sent: Wednesday, February 04, 2009 1:42 AM
> To: Eran Hammer-Lahav
> Cc: Jonathan Rees; www-tag@...; Mark Nottingham; www-talk@...
> Subject: Re: Request for feedback: HTTP-based Resource Descriptor
> Discovery