IRI issues (in quite some detail)

This is a laundry list of issues that have come up on the IRI spec
update. They are grouped into things that are related where possible.
I hope this is a fairly complete initial pass, but I'm sure there are
still a few things missing. In your replies, please distinguish
addition of issues from discussion of specific issues.

IRIs and IDNA
=============

- %encoding vs. punycode when converting from IRI to URI
  (see mail by Roy:
  http://lists.w3.org/Archives/Public/public-iri/2009Aug/0010.html
  and I-D by Dave Thaler:
  http://tools.ietf.org/html/draft-iab-idn-encoding)
- Update of Bidi section:
  - allow combining marks at end of component
  - adapt component restrictions to those in [IDNA-Bidi]
  - check about other syntactic characters (not only dot)
    and payload characters (e.g. %)
  [- rework examples]
- IDNA 2003 vs. IDNA 2008:
  - to map or not to map for IRI->URI and on resolution in general
  - what mapping to use (see http://www.unicode.org/reports/tr46/
    for a potential direction)
  - what to do about ß (sharp s) and ς (final sigma)
    - short term
    - long term
  - advice for authors:
    - Always use prepped (in IDNA 2003 terminology) or
      legal U-Label (in IDNA 2008 terminology)
    - Avoid separators other than '.'
    - Avoid IDNs that are not legal in either IDNA 2003 or 2008 ?

LEIRIs and HTML5 references
===========================

- Are there other "main areas" (like XML and HTML) that warrant
  similar 'preferential treatment' [let's really hope not]
  (see also http://www.w3.org/International/iri-edit/spec-use-survey.html
  (way incomplete))
- Naming these explicitly (or not)
- What's the best name for HTML5 references
- Using syntax or procedure for definition
  (syntax seems to work better for the requirements of XML and LEIRIs,
  procedure may work better for HTML5)
- Place in spec: Appendix? Separate section (for each, or for both
  together?)? As part of a section 5 (Normalization and Comparison;
  probably not, seems confusing to many people)
- Mix with main IRI->URI procedure or not
  (ideally separate, but may not be easy for some aspects)
- What to keep in 'host' specs (e.g. definition of whitespace?)

HTML5 reference specific issues
===============================

- '\' as path separator
- '#' in fragment identifiers
- '[' and ']' other than for IPv6 literals
- Processing of other characters not allowed
- treatment of lonely '%' (not followed by 2 hex digits)
- special behavior for encoding in http: and https: query parts
  (use document encoding if available instead of UTF-8)
- some more (to be completed, including pointer to relevant
  documents (from Anne))
- How to advise authors,... against using 'bugwards-compatible'
  features (completed for LEIRIs, needs to be discussed and done
  for HTML5)

IRI issues
==========

(at http://www.w3.org/International/iri-edit/, not already mentioned
above)

- http://www.w3.org/International/iri-edit/#identity-101
- http://www.w3.org/International/iri-edit/#transcodeNFC-103

Registration issues
===================

- Allow definition of URI schemes simply in terms of IRIs?
- What other adjustments needed resulting from issues above?

Issues for individual schemes
=============================

- Piggybacking mailto:
  - Allowing UTF-8 officially where current email infrastructure
    does allow it
  - Fixing other issues in mailto:
- Updating mailto: for EAI (or creating a new scheme)
- Others?

URI issues (potentially)?
==========

- do '[' and ']' need to be forbidden in URIs
- does '#' need to be forbidden in URI fragment parts

Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@...
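
[Editorial illustration, not part of the original message.] Two of the
items in the list above can be made concrete with a short sketch: the
choice between %-encoding and punycode for the host part when going
from IRI to URI, and the detection of a lonely '%'. The function names
and the example hosts below are hypothetical, chosen only for
illustration. Python's built-in "idna" codec implements IDNA 2003, so
it also shows the IDNA 2003 mapping of ß to "ss", which is one side of
the ß question raised above. This is a sketch under those assumptions,
not the procedure of any IRI specification.

```python
import re
from typing import List
from urllib.parse import quote


def host_percent_encoded(host: str) -> str:
    """Percent-encode the UTF-8 bytes of each non-ASCII character."""
    # quote() leaves ASCII letters, digits and "_.-~" untouched.
    return quote(host, safe="")


def host_punycode(host: str) -> str:
    """Convert the host label by label with the stdlib 'idna' codec.

    Assumption: IDNA 2003 behaviour (nameprep mapping, then punycode),
    which is what Python's built-in codec provides.
    """
    return host.encode("idna").decode("ascii")


# A '%' that is not followed by exactly two hex digits is "lonely".
LONELY_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def lonely_percents(reference: str) -> List[int]:
    """Return the index of every lonely '%' in the reference."""
    return [m.start() for m in LONELY_PERCENT.finditer(reference)]


if __name__ == "__main__":
    host = "bücher.example"  # hypothetical host name
    print(host_percent_encoded(host))  # b%C3%BCcher.example
    print(host_punycode(host))         # xn--bcher-kva.example
    # IDNA 2003 maps sharp s to "ss" before encoding.
    print(host_punycode("straße.example"))
    print(lonely_percents("100%/a%20b%zz"))
```

The two host forms are both pure ASCII but are not interchangeable:
only the punycode form is something a DNS resolver would look up
directly, which is why the choice between them is listed as an issue.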
Re: IRI issues (in quite some detail)

* Martin J. Dürst wrote:
>URI issues (potentially)?
>==========
>- do '[' and ']' need to be forbidden in URIs
>- does '#' need to be forbidden in URI fragment parts

I do not think these are worth considering; there are existing
technologies that use them precisely because they have been forbidden,
to distinguish resource identifiers from other things in protocol
elements, e.g. XML Schema uses a "##identifier" syntax and "CURIEs"
use "[identifier]". Retroactively allowing them would do little more
than cause confusion.
-- 
Björn Höhrmann · mailto:bjoern@... · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/