Re: ws: and wss: schemes

On Thu, 17 Sep 2009, Julian Reschke wrote:
> > It now says:
> >
> > Encoding considerations.
> > Characters in the host component that are excluded by the syntax
> > defined above must be converted from Unicode to ASCII by applying
> > the IDNA ToASCII algorithm to the Unicode host name, with both the
> > AllowUnassigned and UseSTD3ASCIIRules flags set, and using the
> > result of this algorithm as the host in the URI.
> >
> > Characters in other components that are excluded by the syntax
> > defined above must be converted from Unicode to ASCII by first
> > encoding the characters as UTF-8 and then replacing the
> > corresponding bytes using their percent-encoded form as defined in
> > the URI and IRI specification. [RFC3986] [RFC3987]
>
> I think that's good, except that the mention of IRI in the last sentence
> seems to be superfluous. RFC3986 already defines everything that is
> needed here. Or is there something specific from the IRI spec you think
> is relevant? (In which case it should state that more clearly).

[...] IRIs is helpful, even if not strictly necessary. (As Martin later
pointed out, though, in general, how to convert Unicode to UTF-8 to
percent escapes appears to be defined in 3987, not 3986.)

On Fri, 18 Sep 2009, "Martin J. Dürst" wrote:
>
> I think this has various problems.
>
> First, it is fixed to IDNA 2003 (I think I may have said this already).
> IDNA 2008 is around the door. It doesn't use terms such as "ToASCII" or
> "AllowUnassigned".

What are the magic terms that we should use instead? (This will affect
HTML5 also; any advice on how to fix the terminology there would be very
welcome also.)

> Second, if this is about resolution (rather than about generic
> conversion), and because this is a new scheme, it should not exclude the
> case that some part of a domain name (reg-name) is percent-encoded,
> because both RFC 3986 and 3987 allow this.

Not sure what you mean here.

> Third, wording this as "characters" seems to say that this is a
> character-by-character operation, or that it might be applied to
> subsequent non-ASCII characters in groups, but ToASCII, when used, has
> to be applied to whole labels, not characters.

The paragraph applies it to the whole hostname.

> Fourth, as http://tools.ietf.org/html/draft-iab-idn-encoding-00 shows in
> more detail, assuming that DNS is always used for resolution of
> reg-names, and the technology will never be used e.g. on intranets with
> other resolution services seems to be unnecessarily restrictive.

Not sure what you mean here.

> Ideally, all the above points should be addressed by some work on the
> IRI front (public-iri@... cc'ed), but that work isn't done yet.

That would indeed be ideal.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
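
For readers following the thread, here is a rough sketch of the two conversions described in the quoted "Encoding considerations" text: IDNA ToASCII for the host, and UTF-8 plus percent-encoding for the other components. It is illustrative only and not part of the draft. The function names (`host_to_ascii`, `path_to_ascii`, `ws_uri`) are made up for this example. It relies on Python's built-in `idna` codec, which implements the IDNA 2003 ToASCII operation per label; as far as I know the codec does not expose the AllowUnassigned or UseSTD3ASCIIRules flags, so treat it as an approximation of what the draft asks for. It also assumes the path has not already been percent-encoded.

```python
from urllib.parse import quote


def host_to_ascii(host: str) -> str:
    # Assumption: the built-in "idna" codec (IDNA 2003) stands in for the
    # ToASCII step. It works label by label on the dot-separated host name.
    return host.encode("idna").decode("ascii")


def path_to_ascii(path: str) -> str:
    # Encode as UTF-8, then percent-encode the resulting bytes.
    # "/" is left alone so the path structure survives. A literal "%" in
    # the input would be escaped again as "%25".
    return quote(path, safe="/", encoding="utf-8")


def ws_uri(host: str, path: str, secure: bool = False) -> str:
    # Hypothetical helper that assembles a ws: or wss: URI from its parts.
    scheme = "wss" if secure else "ws"
    return f"{scheme}://{host_to_ascii(host)}{path_to_ascii(path)}"


print(ws_uri("bücher.example", "/chat/café"))
# ws://xn--bcher-kva.example/chat/caf%C3%A9
```

Note that the host conversion operates on whole labels rather than on individual characters, which is the distinction raised in the third point above.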