Guidelines on usage of // in new URI schemes

by Eran Hammer-Lahav

I am in the process of proposing a new URI scheme to identify user accounts [1]. This is part of the WebFinger protocol [2] effort.

This email is *not* an invitation to debate the merits of this new URI scheme (just yet). I am sure we will have many lively discussions about it shortly but I would like to present a proposal before we have a public debate about it here.

The new scheme has two components, a local identifier (username, screenname, handle, etc.) and a host (which can resolve and authenticate the local identifier). When looking at the URI specification (RFC 3986) and at the new URI guidelines (BCP 35), it is hard to figure out what is an appropriate use of // in new schemes.

In this case, we have a requirement to keep the URI (the part after the scheme:) looking as close as possible to an RFC 822 identifier (username@host), which leaves two options:

acct:username@host
acct://username@host

The 'username@host' part seems to fit perfectly into the URI authority as defined by RFC 3986. However, since the URI does not have a path, it does not really contain a hierarchical structure (just the top level host).

The benefit of using // in this case is that existing URI parsing code can be used unmodified to process the new URI. It is a simple profile which only allows the userinfo and host subcomponents of the authority component, and no other URI components. Since the new scheme will often be used with URI templates and other facilities commonly used with http: URIs, it is very convenient to have a common structure (even if it is only a subset). I don't see any downside to using // other than defying expectations established by the mailto: URI scheme.

The benefit of not using // is that the URI follows the well-established pattern of mailto: and saves two bytes. The downside is that the scheme must spell out how to break the URI path into its own scheme-specific subcomponents.
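
To make the trade-off concrete, here is a minimal Python sketch that handles both candidate forms. The function name parse_acct and the rejection of ports, passwords, queries, and fragments are assumptions made for illustration, not part of any proposal. With //, the standard library's generic parser has already isolated the userinfo and host; without //, everything after the scheme arrives as an opaque path, and the split on "@" is a rule the scheme would have to define itself.

```python
from urllib.parse import urlsplit


def parse_acct(uri):
    """Return (username, host) for either candidate acct: syntax (illustrative only)."""
    parts = urlsplit(uri)
    if parts.scheme != "acct" or parts.query or parts.fragment:
        raise ValueError(f"not a bare acct URI: {uri!r}")

    if parts.netloc:
        # acct://username@host: the generic parser has already split the authority.
        # Assumed profile: userinfo and host only, so reject everything else.
        if parts.path or parts.port is not None or parts.password is not None:
            raise ValueError(f"only userinfo and host are allowed: {uri!r}")
        user, host = parts.username, parts.hostname
    else:
        # acct:username@host: the generic parser sees one opaque path, so the
        # split on the last "@" is a scheme-specific rule (an assumption here).
        user, _, host = parts.path.rpartition("@")
        host = host.lower()

    if not user or not host:
        raise ValueError(f"missing username or host: {uri!r}")
    return user, host


print(parse_acct("acct://username@host"))
print(parse_acct("acct:username@host"))
```

Both calls yield the same pair; the difference is only in which specification has to describe the split.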

So far the feedback I have received has focused on style, which is perfectly valid, but I want to make sure I am not missing anything. My preference is to reuse as much as possible and therefore include the //.

Any suggestions?

EHL

[1] http://www.hueniverse.com/hueniverse/2009/08/making-the-case-for-a-new-acct-uri-scheme-for-accounts.html
[2] http://code.google.com/p/webfinger


Re: Guidelines on usage of // in new URI schemes

by Graham Klyne

[Sent originally in response to IETF-apps, but I realize it's probably more
usefully sent here.]

> On Wed, 2009-08-19, Eran Hammer-Lahav wrote:
>> I am in the process of proposing a new URI scheme to identify user
>> accounts [1]. This is part of the WebFinger protocol [2] effort.

[...]

>> acct:username@host
>> acct://username@host

[...]

>> Any suggestions?

The '//' is used to introduce an "authority" component in a URI.  A URI reference
without an authority, resolved against a base that has one, acquires the base
authority (see RFC 3986, pp. 31-32, and the "/g" example in section 5.4.1); e.g.

base URI:
   foo://auth/path

URI reference:
   /otherpath

would resolve to:
   foo://auth/otherpath

If your URI scheme consists of *only* an authority (which I think is your stated
intent) then the distinction is probably moot.
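
The resolution behaviour above can be sketched in a few lines of Python. This is a deliberately partial illustration of the reference-resolution rules in RFC 3986, section 5.2.2: the function name resolve_reference is hypothetical, and relative paths and dot-segment removal are left out. It is written by hand because Python's urllib.parse.urljoin only resolves references for schemes it already knows, so it would return "/otherpath" unchanged for a made-up scheme such as foo:.

```python
from urllib.parse import urlsplit, urlunsplit


def resolve_reference(base, ref):
    """Resolve ref against base for the simple cases only (illustrative sketch)."""
    b = urlsplit(base)
    r = urlsplit(ref)
    if r.scheme:
        # The reference is already a full URI; the base is ignored.
        return ref
    if r.netloc:
        # Network-path reference ("//auth2/p"): only the scheme is inherited.
        return urlunsplit((b.scheme, r.netloc, r.path, r.query, r.fragment))
    if r.path.startswith("/"):
        # Absolute-path reference: scheme AND authority come from the base.
        return urlunsplit((b.scheme, b.netloc, r.path, r.query, r.fragment))
    raise NotImplementedError("relative-path references are omitted from this sketch")


print(resolve_reference("foo://auth/path", "/otherpath"))
```

For a scheme that consists of nothing but an authority there is no path to resolve against, which is why the distinction matters little in that case.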

BUT, the thought that occurs to me is that there are multiple notions of
authority here (e.g. the authority of the user-account issuer vs the authority
of a user to post messages).

Where this leads me is to a *third* design along the lines of:

   foo://user-issuing-authority/username

The main advantage of this that I see is that there is shed-loads of code out
there that will parse this form of URI and present you with the pieces (i.e.
pretty much every URI handling library in every language, not to mention the
regex in RFC 3986, appendix B).
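
As a rough illustration of that point, the regular expression from RFC 3986, appendix B can be dropped into Python as-is. The helper name split_uri and the dictionary keys are choices made for this sketch; the group numbers (2, 4, 5, 7, 9) are the ones the appendix assigns to scheme, authority, path, query, and fragment.

```python
import re

# Regular expression from RFC 3986, appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into its five generic components."""
    m = URI_RE.match(uri)  # every sub-pattern is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


for candidate in (
    "foo://user-issuing-authority/username",
    "acct://username@host",
    "acct:username@host",
):
    print(candidate, split_uri(candidate))
```

In the third design the issuer lands in the authority and the username in the path; in the acct:username@host form the whole "username@host" string comes back as an undivided path.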

A second advantage is that there is further scope for future extension to, say,
hierarchical user structures - which I recognize is not part of the current scope.

A third possible advantage (with a nod to a comment made by Larry Masinter) is
that by putting the user component in the path rather than the "domain"
authority, some username I18N issues may be easier.  (I'm no expert here, so I
could easily be way wrong, but I seem to recall reading somewhere that
mechanisms for I18N within the authority are likely to diverge from the
widely-used UTF-8 + %-encoding that apply in the path of the URI.)

#g


[Moderator Action] Re: Guidelines on usage of // in new URI schemes

by Timur Shemsedinov

Hello

See RFC 2718 - Guidelines for new URL Schemes
http://www.ietf.org/rfc/rfc2718.txt

2.1.2 Improper use of "//" following "<scheme>:"

Contrary to some examples set in past years, the use of double
slashes as the first component of the <scheme-specific-part> of a URL
is not simply an artistic indicator that what follows is a URL:
Double slashes are used ONLY when the syntax of the URL's <scheme-
specific-part> contains a hierarchical structure as described in RFC
2396. In URLs from such schemes, the use of double slashes indicates
that what follows is the top hierarchical element for a naming
authority. (See section 3 of RFC 2396 for more details.) URL
schemes which do not contain a conformant hierarchical structure in
their <scheme-specific-part> should not use double slashes following
the "<scheme>:" string.


On Thu, Aug 20, 2009 at 8:48 AM, Eran Hammer-Lahav <eran@...> wrote:
[...]
_______________________________________________
Apps-Discuss mailing list
Apps-Discuss@...
https://www.ietf.org/mailman/listinfo/apps-discuss



Re: Guidelines on usage of // in new URI schemes

by Larry Masinter

This was discussed on apps-discuss and the URI list a while back, so I have bcc’d those lists, but I want to focus the discussion on the public-iri@... list, so please only reply there.

In order to handle IDNs appropriately, I would like to make the rule that any scheme that allows non-ASCII or pct-encoded values in the “host” field in the generic syntax MUST allow or mandate that IRI -> URI processing follow IDNA rules. That is, no matter what the scheme, if you have

scheme://nonascii.name/path/here

as an IRI, and want to translate it to a URI, you MUST use IDNA to turn it into

scheme://alabel.for.nonascii.name/ascii.for.path/ascii.for.here

no matter what the scheme. This is what you have to do for almost all URI schemes now anyway in order to function properly.
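
A small Python sketch of the conversion described above, under stated assumptions: the function name iri_to_uri is hypothetical, the input is limited to the scheme://host/path shape, and the path is converted with the usual UTF-8 plus percent-encoding. Note that Python's built-in "idna" codec implements the older IDNA 2003 rules, so a stricter implementation might need a different library.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Convert a scheme://host/path IRI to a URI (illustrative sketch only)."""
    parts = urlsplit(iri)
    if (not parts.hostname or parts.username or parts.port is not None
            or parts.query or parts.fragment):
        raise ValueError("this sketch only handles scheme://host/path")
    # Host: each non-ASCII label becomes an A-label (IDNA 2003 via the stdlib codec).
    ascii_host = parts.hostname.encode("idna").decode("ascii")
    # Path: UTF-8 bytes are percent-encoded; existing escapes are left alone.
    ascii_path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, ascii_host, ascii_path, "", ""))


print(iri_to_uri("scheme://b\u00fccher.example/p\u00e4th/here"))
```

The point of the sketch is that the host and the path go through two different mappings, and neither depends on the scheme name.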

 

This would change the guidelines on use of “//” for new schemes, but are there any URI schemes in use for which this would actually be a problem in practice?

 

Larry

--

http://larry.masinter.net

 

From: apps-discuss-bounces@... [mailto:apps-discuss-bounces@...] On Behalf Of Timur Shemsedinov
Sent: Thursday, August 20, 2009 7:44 AM
To: Eran Hammer-Lahav
Cc: URI; apps-discuss@...
Subject: [Moderator Action] Re: Guidelines on usage of // in new URI schemes

 

[...]