Char comment wrong?
In the XML specification, the Char production at
http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char says:
[2] Char ::= #x9 | #xA | #xD
| [#x20-#xD7FF] | [#xE000-#xFFFD]
| [#x10000-#x10FFFF]
/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
The comment appears to be inconsistent with the production:
- Unicode includes all of the ASCII control characters. (E.g.,
code points U+0000 through U+001F are all assigned in the chart at
http://unicode.org/charts/PDF/U0000.pdf.)
- The Char production excludes most of those control characters (every
code point from U+0000 through U+001F except #x9, #xA, and #xD).
- The comment works by listing exclusions (it starts with the set "any
Unicode character" and narrows it down to the intended set).
- However, the comment does not mention the excluded control characters.
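
To make the discrepancy concrete, here is a rough Python sketch that compares the production with a literal reading of the comment. The function names are mine, not from the spec, and I am assuming "any Unicode character" means any code point from U+0000 through U+10FFFF; a different reading of that phrase would change the result.

```python
def is_xml_char(cp: int) -> bool:
    """True if code point cp matches production [2] Char as quoted above."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def described_by_comment(cp: int) -> bool:
    """Literal reading of the comment (an assumption on my part):
    any code point except the surrogate blocks, FFFE, and FFFF."""
    is_surrogate = 0xD800 <= cp <= 0xDFFF
    return 0 <= cp <= 0x10FFFF and not is_surrogate and cp not in (0xFFFE, 0xFFFF)


# Code points where the production and the comment disagree.
mismatch = [cp for cp in range(0x110000) if is_xml_char(cp) != described_by_comment(cp)]

print(len(mismatch))
print(" ".join(f"U+{cp:04X}" for cp in mismatch))
```

Under that reading, the only disagreements are the control characters below U+0020 other than #x9, #xA, and #xD, which are exactly the exclusions the comment leaves out.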
Daniel
--
(Plain text sometimes corrupted to HTML "courtesy" of Microsoft Exchange.)