URGENT: Preparing for next round of TAG Review of HTML 5

TAG members:

We are at our deadline date for readthroughs of assigned sections [1] of the HTML 5 draft, and we have approximately 3 weeks until the start of our September F2F. This email sets out the steps I would like all TAG members to take now to prepare the input we will need for our discussions.

What I'm asking each TAG member to do:
======================================

Please send us >now< a list of specific issues with the HTML 5 drafts that you believe merit group consideration by the TAG, or else an explicit indication that you do not wish to suggest any. Send at least an initial list of your most critical issues in time for the call on Thurs (3 Sept). (I believe I requested this last week, so you have had some notice. If you can't make the date, then please suggest a date you can make; late input will be considered.)

Please give priority to issues that significantly impact the integrity and health of the Web, those that have the most important architectural implications, and those that relate to consistency with other specifications. The TAG may or may not decide to provide input on smaller issues as well, but we will do that on a time-available basis after dealing with any that clearly have TAG scope.

For each issue you identify:

* Indicate whether you believe it to be a high priority for TAG consideration, and if so why (or why not).

* If possible, identify specific text or sections in the HTML 5 draft that are causing concern, or if many parts of the draft are pertinent, highlight one or two representative specifics.

* If possible, suggest a resolution. History suggests that proposing revised text can be constructive in many cases.

Please do list any issues that the TAG has already discussed, such as version identifiers, if you continue to believe they are important, and indicate their priority relative to others.

What the TAG as a whole will do:
================================

I intend to gather this input and schedule very brief discussion of each issue that's proposed as being highest importance. The initial goal will be to get TAG consensus on which issues to give priority, then to assign several TAG members to dive deeper on each, making sure the concern is valid, proposing positions that the TAG might take, etc.

While this is going on, we will continue discussion of some of the issues that have already come up, including suggestions for explicit version indicators, concerns relating to content type sniffing, etc.

Much of the time at the Sept. F2F will be spent gathering and refining our analysis and preparing comments (if any).

A note to members of the HTML community
=======================================

I am suggesting that the TAG do as much of this work in public as possible, send issues lists to this public mailing list, etc. That has the advantage that everyone can follow our deliberations, but there's a real risk of people getting upset too early about things they see coming up. Please don't. The point is to give the TAG a chance to discuss and deliberate before providing feedback to you. Also: don't assume that because we're doing a careful, detailed review that the TAG will ultimately decide to make formal comments. We may or may not, now or later.

By all means, if you have constructive suggestions or can clear up misunderstandings as our discussions proceed, dive in. Beyond that, please give the TAG a chance to come to well reasoned decisions before you get too concerned about issues that may have been raised by individual TAG members.

Thank you.

Noah

[1] http://lists.w3.org/Archives/Member/tag/2009Jul/0026.html

P.S. Tracker: this relates to TAG ISSUE-54

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Re: URGENT: Preparing for next round of TAG Review of HTML 5

I did some of my homework re HTML5. I had some comments and questions on section 2.4.

Section 2.4 describes several datatypes. The syntax for these datatypes is described informally.

Q1. Why not use BNF to describe the syntax?

The section includes algorithms for parsing the string representation.

Q2. Why are these algorithms required? Typically, it is hard to get the bugs out of them. Larry says they are for conformance/consistency. If so, why not just reference standard works such as ISO 8601 or IEEE 754?

Q3. Does HTML5 convert the string representation to binary for, say, floating point numbers? If so, I'm sure implementations just use the native language libraries such as the Java Math library. Why not just refer to these?

Note that XML Schema covers much of the same ground and may be a good reference.

All the best, Ashok

noah_mendelsohn@... wrote:
> TAG members:
> [...]
Re: URGENT: Preparing for next round of TAG Review of HTML 5

Thank you. I will be sure that we discuss these concerns, either tomorrow or within the next week or so.

FWIW: I agree with your concern about algorithms (imperative) vs. declarative expositions. Regarding references to external specifications for numbers etc., I obviously can't speak for the HTML 5 WG, but I suspect that their top priority is exact compatibility with existing HTML deployments, and I would guess that referencing other specifications would require one to prove that those references match what HTML does in all respects. Also, as much as I prefer declarative expositions, I believe that at times the HTML draft depends on things like knowing where input pointers are left after certain things are parsed or certain errors are encountered, and that presumably is not in all cases covered by the external specs.

Noah

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

ashok malhotra <ashok.malhotra@...>
Sent by: www-tag-request@...
09/02/2009 10:39 PM
Please respond to ashok.malhotra
To: noah_mendelsohn@...
cc: www-tag@...
Subject: Re: URGENT: Preparing for next round of TAG Review of HTML 5

I did some of my homework re HTML5. I had some comments and questions on section 2.4
[...]
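Noah's remark about input pointers can be made concrete with a small sketch. This is not the HTML 5 algorithm. It is a minimal illustration, with hypothetical names (`collectDigits`, `parseLeadingInteger`), of a parser written as imperative steps that reports both a result and the position where the input was left, something a bare reference to an external number grammar would not define.

```javascript
// Hypothetical helper: collect ASCII digits starting at `position`.
// Returns the digits found and the position just past them.
function collectDigits(input, position) {
  let digits = "";
  while (
    position < input.length &&
    input[position] >= "0" &&
    input[position] <= "9"
  ) {
    digits += input[position];
    position += 1;
  }
  return { digits, position };
}

// Hypothetical step-by-step parser: skip leading spaces, read digits,
// and signal an error (value null) if no digit was found. In every case
// the caller learns where the input position was left.
function parseLeadingInteger(input) {
  let position = 0;
  while (position < input.length && input[position] === " ") {
    position += 1;
  }
  const collected = collectDigits(input, position);
  if (collected.digits === "") {
    return { value: null, position: collected.position };
  }
  return { value: Number(collected.digits), position: collected.position };
}

console.log(parseLeadingInteger("  42px")); // value 42, position 4
console.log(parseLeadingInteger("px"));     // value null, position 0
```

A caller that goes on to parse a unit such as "px" can resume at the returned position, which is the kind of detail a declarative grammar alone tends to leave unstated.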
On BNF and other formal notations in the HTML 5 spec [... TAG Review of HTML 5]

On Wed, 2009-09-02 at 19:39 -0700, ashok malhotra wrote:
> I did some of my homework re HTML5. I had some comments and questions
> on section 2.4
>
> Section 2.4 describes several datatypes. The syntax for these datatypes
> is described informally.
>
> Q1. Why not use BNF to describe the syntax?

Editorial style. I don't like the aversion to formalisms, but he seems to satisfy at least some large part of the readership without using them, and no one has supplied a combination of BNF and error handling rules that would serve in place of the prose he uses.

I happened to discuss this with him even before the current HTML WG started:

[[
On Wed, 22 Feb 2006, Dan Connolly wrote:
>
> On Mon, 2006-01-09 at 07:05 +0000, Ian Hickson wrote:
> [...]
> > Personally I would discourage the use of BNF, however, as it makes it very
> > difficult to define error handling rules, and specifications often forget
> > to define how to go from the parsed tree to the semantics that the
> > specification defines, leaving it up to UA implementors to work out the
> > implied mapping.
>
> Defining error handling rules is tricky, no doubt. But I wonder why
> you say that BNF makes it more so. What do you prefer?

Prose.
]]
 -- http://lists.w3.org/Archives/Public/www-qa/2006Feb/0014.html

A similar argument argues against a schema for the language:

"The experience with HTML 4 suggests that designating a normative schema causes people to use it and to ignore the machine-checkable conformance criteria that the schema does not embody. To avoid that kind of situation, it would be better not to designate a normative schema and instead consider schemata implementation details just like one would consider particular lines of C++ implementation details."
 -- Henri Sivonen 23 Mar 2007
 http://lists.w3.org/Archives/Public/public-html/2007JanMar/0357.html

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
gpg D3C2 887B 0F92 6005 C541 0875 0F91 96DE 6E52 C29E
Re: URGENT: Preparing for next round of TAG Review of HTML 5

noah_mendelsohn writes:

> [send input]

I'm behind, but this is going more slowly than I had hoped, partly because I'm finding the structure of the document inimical to finding answers to my questions easily -- that's in itself an issue, but one I can't formulate concretely yet. . .

I've attached my raw notes, which are complete up through the end of section 2.4.

Specific issues meriting consideration:

1.4 implies XForms _could not_ be reconstructed within HTML, which is at best contentious and at worst manifestly false, and appeals to an unidentified decision about "the previously chosen direction for the Web's evolution".

1.7 Is HTML5 two things or three out of the following?:
 1) An abstract language;
 2) In-memory representations of resources that use that abstract language;
 3) Concrete syntax

2.2 Document conformance and implementation performance are in principle decoupled, with the consequence that every document-content 'must' has to be checked against the parser, and every 'parse error' or algorithm 'fail' or 'abort' has to be checked against the document constraints. Which, if either, of these is the so-called authoring spec. based on?

2.4.2, 2.4.3 -- two changes from XHTML/HTML 4.01, one more restrictive, one less -- a general issue -- are these changes a) tabulated anywhere, b) motivated?

[no specific locus] There's a strong implication, if not an explicitly stated requirement, that only character sequences are *XML documents*. This appears to rule out the possibility of conformant processing of XHTML in e.g. a pipeline processor. There is a more general problem in that the word 'document' is used both of character sequences and of DOM *Documents*, and it is not always clear what constraints apply to what.

ht

1.1 "the HTML specifications" -- raises the question of scope -- just what documents is this one intended to supersede
 - by its editor?
 - by the HTML5 WG?
 - by the W3C?

1.2 Scope again -- "tools that are intended to conform to this specification" is content-free!
1.3 The applications paragraph -- this is what you can _build_ on what is specified here?

1.4 "without requiring browsers to implement rendering engines that were incompatible with existing HTML Web pages." -- implies XForms _did_ require this -- true?

"The proposal was rejected on the grounds that the proposal conflicted with the previously chosen direction for the Web's evolution." -- Anyone have a reference for this?

1.5.1 "Serializability of script execution" - what a _very_ odd thing to start with!

1.6.2 "Thus, authors and implementors who do not need such a modularization scheme can consider this specification a replacement for XHTML 1.x, but those who do need such a mechanism are encouraged to continue using the XHTML 1.1 line of specifications."

1.7 Two things, or three:
 1) An abstract language;
 2) In-memory representations of resources that use that abstract language;
 3) Concrete syntax
?

1.9 Elements (abstract?) are denoted by tags (concrete).

So this is only a non-normative "quick introduction", but the very strong emphasis on the DOM as the fundamental core of things is odd. It leaves out non-DOM-based applications, in particular any use with generic XML-based tools, and foregrounds inline script modifying an element, which is at best questionable. . .

"The value can also be omitted altogether if it is empty.":
 <foo baz= bar="a"/> ???

The example given directly contradicts HTML 4.01:
 Example says <input name=address disabled>
 is equivalent to <input name=address disabled="">
 HTML4.01 says it's equivalent to <input name=address disabled="disabled">
See also below on 2.4.2

2.1.2 [minor] The use of typewriter font for DOM object classnames is not explained, and runs counter to the W3C spec. guidelines for accessibility, as it presents a semantic distinction in a non-accessible way.

2.1.6 'resource' is used where 'representation' would be more consistent with AWWW/TAG usage. I think this has been raised elsewhere already.

2.2 The appearance of a script element in an XML document 'within a transformation expressed in XSLT' is called out for special not-as-specified-by-this-spec. treatment. But surely that applies to _all_ HTML elements found in stylesheets. . . Maybe the vertical bar is meant to suggest that this is an _example_ of "the semantics of [HTML] elements [being] overridden by other specifications."

2.2 I don't understand the difference between 'static' and 'dynamic' non-interactive user agents. The example doesn't help -- what properties are being assumed for "overhead displays"?

2.2 I can't figure out what this implies -- a 'for instance' would help a lot: "For the parts of this specification that are defined in terms of an events model or in terms of the DOM, [non-scripting] user agents must still act as if events and the DOM were supported."

2.2 I think this is too strong: "Authoring tools and markup generators must generate conforming documents". It's OK in my view to output well-formed-but-not-valid XML from an XML editor, for instance as an intermediate stage during authoring.

2.2 As I read the fifth-from-last para. and back at the beginning the fourth para. and the Note thereafter, the decoupling of document from implementation conformance means that for every 'must' wrt document structure, there may be a corresponding 'parse error' or there may be what amounts to a preemptive recovery strategy. I'm curious to know whether and if so how often such disconnects arise. . . Boolean attributes appear to be a case of this.

2.2 I agree with the questions raised in existing threads about the implicit "XHTML MUST NOT be served as text/html" prohibition here.

2.2 "Entity references to unknown entities must be treated as if they contained just an empty text node for the purposes of the algorithms defined in this specification."
Surely this should be "for the purposes of implementation conformance", to avoid possible confusion wrt document conformance, where unknown entities MUST NOT occur.

2.2.1 XML support should be mandated as no less than 4th edition, and allowed for higher. . . Likewise "support some version" of the DOM should be more precise.

"Some parts of the language described by this specification only support JavaScript as the underlying scripting language." Hunh? For instance? Why?

2.4.2 This repeats the change to allow e.g. disabled="" -- I guess there's some implementation precedent -- this is a classic case of dumbing-down :-(

A quick check suggests that _any_ value (including 'false') is treated as present in recent FF, IE, Opera, so I _really_ don't understand the motivation for this . . .

validator.nu rejects disabled="foo" but accepts disabled="" as HTML5, rejects both as HTML4.

Maybe this is a good way in to a complex issue -- 2.4.2 uses 'must' language, and the traditional reference to RFC2119 is present. We also find the following in 2.2 Conformance, near the end:

"Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents."

So we do get that in this case conforming documents must include boolean attributes in only one of three forms, e.g. "disabled", "disabled=''" or "disabled='disabled'", and conformance checkers have to detect and signal failures to observe this constraint.

In a related point, the first of these is not XML-allowed, but this is not called out -- indeed the status of all of 2.4 vis-a-vis XHTML is unclear to me. According to the discussion in the section para of 2.1, this section should say "does not apply to XHTML".

Another thing I don't see, after considerable searching: I looked particularly in what I take to be the relevant part of the parsing algorithm, namely 9.4.2 Tokenization, particularly 9.4.2.5--9.4.2.15, and I saw nothing which would handle boolean attributes specifically at all. The DOM _interface_ to attribute reflected properties for them is well specified (2.8.1), but that is separate, I think.

2.4.3 Another change from HTML 4 and XHTML:

"If an enumerated attribute is specified, the attribute's value must be an ASCII case-insensitive match for one of the given keywords that are not said to be non-conforming, with no leading or trailing whitespace."

In HTML 4/XHTML, enumerated attrs are whitespace-stripped before being checked. Why has HTML5 gotten stricter here? I note that in the next section leading/trailing whitespace _is_ ignored around numbers. . .

--
Henry S. Thompson, School of Informatics, University of Edinburgh
Half-time member of W3C Team
10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 651-1426, e-mail: ht@...
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
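The distinction drawn in the 2.4.2 and 2.4.3 notes, between what a conforming document may contain and what a user agent does with it, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the plain-object model of an element's attributes, and the "ltr"/"rtl" keyword list are hypothetical and not taken from the draft.

```javascript
// ASCII-only lowercasing, so that only A-Z are folded.
function asciiLower(text) {
  return text.replace(/[A-Z]/g, (ch) => ch.toLowerCase());
}

// Document conformance, following the three forms listed above:
// no value (modeled here as ""), disabled="", or disabled="disabled".
function isConformingBooleanValue(attributeName, value) {
  return value === "" || asciiLower(value) === asciiLower(attributeName);
}

// User-agent behavior as observed in the notes: presence alone matters,
// so any value (even "false") counts as set. `attributes` is a
// hypothetical plain object mapping attribute names to string values.
function isBooleanAttributeSet(attributes, attributeName) {
  return Object.prototype.hasOwnProperty.call(attributes, attributeName);
}

// Enumerated attributes as quoted from 2.4.3: ASCII case-insensitive
// keyword match, with no stripping of leading or trailing whitespace.
function isConformingEnumeratedValue(keywords, value) {
  const lowered = asciiLower(value);
  return keywords.some((keyword) => asciiLower(keyword) === lowered);
}

console.log(isConformingBooleanValue("disabled", "foo"));          // false
console.log(isBooleanAttributeSet({ disabled: "foo" }, "disabled")); // true
console.log(isConformingEnumeratedValue(["ltr", "rtl"], " ltr"));  // false
```

The first two calls show the disconnect the notes describe: disabled="foo" is non-conforming for a checker, yet a user agent still treats the attribute as present.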
datatypes in HTML 5, Java, and XML Schema and the principle of well-defined behavior [... TAG Review of HTML 5]

On Wed, 2009-09-02 at 19:39 -0700, ashok malhotra wrote:
> I did some of my homework re HTML5. I had some comments and questions
> on section 2.4
>
> Section 2.4 describes several datatypes. The syntax for these datatypes
> is described informally.
[...]
> Q2. Why are these algorithms required? Typically, it is hard to get
> the bugs out of them.
> Larry say they are for conformance/consistency. If so, why not just
> reference standard works such as
> ISO 8601 or IEEE 754.
>
> Q3. Does HTML5 convert the string representation to binary for, say,
> floating point numbers?
> If so, I'm sure, implementations just use the native language libraries
> such as the java Math library.
> Why not just refer to these?
>
> Note that XML Schema covers much of the same ground and may be a good
> reference.

The problem is that the details of the way these datatypes are implemented in the web platform don't quite match Java or XML Schema.

For example, in JavaScript, parseInt("1a1") gives 1 (try it yourself at http://www.squarefree.com/shell/shell.html ) but in Java it throws an exception:

 java.lang.NumberFormatException: java.lang.NumberFormatException: For input string: "1a1"

It's somewhat traditional to say that cases like "1a1" are out of scope and leave them implementation-defined, but that goes against one of the principles of the HTML 5 effort:

"Prefer to clearly define behavior that content authors could rely on, in preference to vague or implementation-defined behavior. This way, it is easier to author content that works in a variety of user agents."
 http://www.w3.org/TR/html-design-principles/#well-defined-behavior

And yes, it's hard to get the bugs out of specifications of this style. Given the number of details and the interactions between them, my mind boggles at the size of the test suite that would give me confidence about interoperability. Numbers like 50,000 tests get thrown around. Considering that XQuery's test suite was about that big and XQuery is more regular (having been designed rather than reverse engineered), even that many will leave lots of holes.

I found the "1a1" case in test materials just for number parsing; it's 2773 lines long... about 250 test cases.
 http://hg.gsnedders.com/php-html-5-direct/file/8c27462f5f41/tests/numbersTest

 Implementation + Test Cases Available For Numbers Subsection of Common Microsyntaxes
 Geoffrey Sneddon
 12 Jul 2007
 http://lists.w3.org/Archives/Public/public-html/2007Jul/0650.html

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
gpg D3C2 887B 0F92 6005 C541 0875 0F91 96DE 6E52 C29E
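The "1a1" comparison above can be reproduced in a few lines. The sketch below sets JavaScript's built-in `parseInt` against a strict parser; `parseIntStrict` is a hypothetical helper written for this illustration that only loosely mirrors the reject-on-any-stray-character behavior described for Java, and is not a port of any Java library.

```javascript
// Lenient: parseInt reads the leading "1" and stops at the first
// character that cannot be part of the number.
const lenient = parseInt("1a1", 10);
console.log(lenient); // 1

// Strict (hypothetical helper): accept only an optional minus sign
// followed by ASCII digits, and throw on anything else.
function parseIntStrict(text) {
  if (!/^-?[0-9]+$/.test(text)) {
    throw new RangeError('For input string: "' + text + '"');
  }
  return Number(text);
}

try {
  parseIntStrict("1a1");
} catch (error) {
  console.log(error.message); // For input string: "1a1"
}
```

Either behavior is defensible in isolation. The point made in the message is that a specification has to pick one and state it, or content that works in one implementation may fail in another.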
Re: datatypes in HTML 5, Java, and XML Schema and the principle of well-defined behavior [... TAG Review of HTML 5]

On 3 Sep 2009, at 20:39, Dan Connolly wrote:
> I found the "1a1" case in test materials just for number parsing;
> it's 2773 lines long... about 250 test cases.
> http://hg.gsnedders.com/php-html-5-direct/file/8c27462f5f41/tests/numbersTest
>
> Implementation + Test Cases Available For Numbers Subsection of Common Microsyntaxes
> Geoffrey Sneddon
> 12 Jul 2007
> http://lists.w3.org/Archives/Public/public-html/2007Jul/0650.html

That's out of date compared with the spec, FWIW.

--
Geoffrey Sneddon
<http://gsnedders.com/>