FW: New Version Notification for draft-duerst-iri-bis-07

Due to some personal difficulties, the split of the document into three parts (parsing, domain names, BCP on character handling, BIDI, etc.) didn't happen. However, Martin did heroically get a new draft out based on some of the interim work.

-----Original Message-----
From: IETF I-D Submission Tool [mailto:idsubmission@...]
Sent: Monday, October 26, 2009 6:08 AM
To: duerst@...
Cc: michel@...; Larry Masinter
Subject: New Version Notification for draft-duerst-iri-bis-07

A new version of I-D, draft-duerst-iri-bis-07.txt has been successfully submitted by Martin Duerst and posted to the IETF repository.

Filename: draft-duerst-iri-bis
Revision: 07
Title: Internationalized Resource Identifiers (IRIs)
Creation_date: 2009-10-26
WG ID: Independent Submission
Number_of_pages: 58

Abstract:
This document defines the Internationalized Resource Identifier (IRI) protocol element, as an extension of the Uniform Resource Identifier (URI). An IRI is a sequence of characters from the Universal Character Set (Unicode/ISO 10646). Grammar and processing rules are given for IRIs and related syntactic forms.

In addition, this document provides named additional rule sets for processing otherwise invalid IRIs, in a way that supports other specifications that wish to mandate common behavior for 'error' handling. In particular, rules used in some XML languages (LEIRI) and web applications are given.

Defining IRI as new protocol element (rather than updating or extending the definition of URI) allows independent orderly transitions: other protocols and languages that use URIs must explicitly choose to allow IRIs.

Guidelines are provided for the use and deployment of IRIs and related protocol elements when revising protocols, formats, and software components that currently deal only with URIs.

[RFC Editor: Please remove this paragraph before publication.] This document is intended to update RFC 3987 and move towards IETF Draft Standard. This is an interim version in preparation for the IRI BOF at IETF 76 in Hiroshima. For discussion and comments on this draft, please use the public-iri@... mailing list.

The IETF Secretariat.
|
|
Re: FW: New Version Notification for draft-duerst-iri-bis-07

On 2009/10/29 10:20, Larry Masinter wrote:
> Due to some personal difficulties, the split of the document
> into three parts (parsing, domain names, BCP on character
> handling, BIDI, etc.) didn't happen. However, Martin did
> heroically get a new draft out based on some of the
> interim work.

I admit that I got a new draft out, but I have to strongly deny "heroically". Most of the changes are from Larry, and the only thing I did was to tweak a few things where I had opinions that differed somewhat from Larry, and to submit a draft before the deadline just so that we have something in the repository.

Anyway, please have a look and comment!

Regards, Martin.

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@...
|
|
Re: FW: New Version Notification for draft-duerst-iri-bis-07

Thanks for the new iri-bis-07 draft. Many of the changes are in the right direction.

It's great that there are detailed steps for conversion between IRIs and URIs (in both directions), but to ensure interoperability (while maintaining security), we need to know how to convert the domain name part of a URI into a DNS packet (or other name lookup protocol). We also need to know how to convert the domain name to the HTTP Host: header. I suppose the HTTP-specific rules should be specified in the HTTP spec(s), but we probably don't want to put DNS-specific rules into the main DNS spec(s), do we?

In particular, I'm thinking about the recommendations and rules regarding such things as %2E (%-encoded dot). Although we probably want to recommend "pure" IRIs and URIs (to content producers), we will find mixtures of %-encoded and not-%-encoded text in the real world. We probably need to be a bit more explicit about rules and recommendations for this in the URI <-> IRI conversions (in both directions).

In the IRI to URI conversion steps, we now parse the IRI before performing any Punycoding and %-encoding. This matches current implementations. However, I believe we need the analogous change in the URI to IRI conversion steps. That is, we need to parse the URI and then use a single character encoding (charset) for each URI component (mainly /path and ?query). The current draft says "Re-percent-encode any octet produced in step 2 that is not part of a strictly legal UTF-8 octet sequence." This would break some URIs, since it specifies a per-octet rule rather than a per-component rule.

In the IRI to URI conversion, we only have one charset (the "document" charset), but in the URI to IRI conversion, we potentially have more than one charset (e.g. /path is UTF-8 and ?query is GB2312). Such mixtures are rare, and content producers should be warned not to use them, but implementers need to know how to process such exceptions.

Erik
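To make the per-octet versus per-component distinction above concrete, here is a minimal Python sketch. It is not taken from the draft or from any implementation: the function names `per_octet` and `per_component` are hypothetical, only %-escapes of octets 0x80 and above are touched (ASCII escapes such as %2E or %2F are left alone), and the draft's other exclusions (for example bidi control characters) are ignored. The example query bytes are chosen as an assumption to show a GB2312 string whose last two octets also happen to be legal UTF-8.

```python
import codecs
import re

# Runs of %-escapes whose octet has the high bit set (0x80-0xFF).
_HIGH_RUN = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def _octets(match: re.Match) -> bytes:
    """Turn a run such as '%E4%B8%AD' into the bytes E4 B8 AD."""
    return bytes.fromhex(match.group(0).replace("%", ""))


def _repercent(exc: UnicodeError):
    """Codec error handler: re-percent-encode the octets that failed."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(f"%{b:02X}" for b in bad), exc.end


codecs.register_error("repercent", _repercent)


def per_octet(component: str) -> str:
    """Per-octet rule: decode whatever happens to be legal UTF-8 and
    re-escape every octet that is not part of a legal sequence."""
    return _HIGH_RUN.sub(
        lambda m: _octets(m).decode("utf-8", errors="repercent"), component
    )


def per_component(component: str) -> str:
    """Per-component rule (one possible reading): decode the component
    only if all of its high octets form legal UTF-8; otherwise leave the
    whole component escaped."""
    try:
        return _HIGH_RUN.sub(lambda m: _octets(m).decode("utf-8"), component)
    except UnicodeDecodeError:
        return component


path = "/%E4%B8%AD"        # UTF-8 path
query = "q=%D6%D0%C2%A9"   # GB2312 query; C2 A9 is also legal UTF-8 (U+00A9)

print(per_octet(path), per_component(path))    # both decode the path
print(per_octet(query))       # q=%D6%D0© : half of the GB2312 text is rewritten
print(per_component(query))   # q=%D6%D0%C2%A9 : left intact
```

Under the per-octet rule the query ends up as a mixture of escaped GB2312 octets and an unrelated Unicode character, so converting the IRI back to a URI no longer yields octets that the server would interpret as the original GB2312 text in the same way. Treating each component as a unit avoids that, at the cost of leaving such components fully escaped.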