JSON parser grammar

So I've been looking at the JSON object grammar and have been talking to Brendan, and I'm getting somewhat conflicting information.

The grammars on json.org and in the ES5 spec both prohibit leading 0's on any number, but the various implementations disagree with this. json2.js (from json.org), IE8, and Chrome all support the standard ES octal literal lexer, e.g. JSON.parse("[010]")[0] === 8

SpiderMonkey allows a leading 0 but still interprets it as a decimal value, e.g. JSON.parse("[010]")[0] === 10

It seems to me that the spec needs to be corrected to specify what the behaviour actually is, rather than what we wish it could be.

--Oliver
_______________________________________________
es-discuss mailing list
es-discuss@...
https://mail.mozilla.org/listinfo/es-discuss
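A minimal sketch of how the disagreement described above could be observed from script. The helper name `probeLeadingZero` and its outcome labels are hypothetical, chosen only to summarise the behaviours reported in this message; what a given engine actually returns depends on that engine.

```javascript
// Hypothetical helper: classify how a parse function treats the text "[010]".
function probeLeadingZero(parse) {
  try {
    const value = parse("[010]")[0];
    if (value === 8) return "octal";    // legacy octal lexer (reported for json2.js)
    if (value === 10) return "decimal"; // leading zero ignored (reported for SpiderMonkey)
    return "other";
  } catch (e) {
    // What the json.org and ES5 grammars call for: a leading 0 is not a valid number.
    return e instanceof SyntaxError ? "rejected" : "other";
  }
}

const outcome = probeLeadingZero(JSON.parse);
console.log(outcome);
```

Passing the parse function in as an argument keeps the probe usable against a library parser as well as the native `JSON.parse`.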
|
|
Re: JSON parser grammar

On Tue, Jun 2, 2009 at 7:06 PM, Oliver Hunt <oliver@...> wrote:
> So I've been looking at the JSON object grammar and have been talking
> to Brendan, and I'm getting somewhat conflicting information.

Since octal wasn't an official part of ES3, remains absent from official ES5, and is now explicitly prohibited from ES5/strict, it is good that it is not specified by JSON. I am surprised that json2.js accepts the syntax, and even more surprised that it interprets it as octal. Although the RFC says

    A JSON parser transforms a JSON text into another representation. A
    JSON parser MUST accept all texts that conform to the JSON grammar.
    A JSON parser MAY accept non-JSON forms or extensions.

I think the behavior you state of json2.js, IE8, and Chrome should be considered a bug. I hesitate to make the same statement about SpiderMonkey, because their behavior falls within both the letter and spirit of the RFC, while maintaining the subset relationship between JSON and EcmaScript.

As for how json2.js interprets these numbers: it follows eval's interpretation on the underlying platform.

-- Cheers, --MarkM
|
|
Re: JSON parser grammar

On Jun 2, 2009, at 7:26 PM, Mark S. Miller wrote:
> Since octal wasn't an official part of ES3, remains absent from
> official ES5, and is now explicitly prohibited from ES5/strict, it
> is good that it is not specified by JSON. I am surprised that
> json2.js accepts the syntax, and even more surprised that it
> interprets it as octal. Although the rfc says
>
>     A JSON parser transforms a JSON text into another representation. A
>     JSON parser MUST accept all texts that conform to the JSON grammar.
>     A JSON parser MAY accept non-JSON forms or extensions.
>
> I think the behavior you state of json2.js, ie8, and chrome should
> be considered a bug. I hesitate to make the same statement about
> SpiderMonkey, because their behavior falls within both the letter
> and spirit of the rfc, while maintaining the subset relationship
> between JSON and EcmaScript.

I'm not talking about the RFC, I'm talking about the ES5 spec. I guess it would be in the spirit of the RFC for the ES5 spec to define a JSON grammar that was more (or less) lax than the RFC, but the ES5 spec itself should not allow variation between implementations that would be considered "valid", as historically any place in ES that has undefined "valid" behaviour has proved to be a compatibility problem later on.

Currently I can make a string containing a JSON object that will produce different output (or not produce output at all) across multiple implementations that are all "correct" -- this seems like something that is just inviting disaster.

The json.org grammar allows the following set of characters in a string:
  * Any unicode character except ", \, or a control character
  * \", \\, \/, \b, \f, \n, \r, \t, or \u four-hex-digits

The ES5 spec is the same, only it defines "control character" as any character less than 0x20, and drops escaped unicode. I'm inclined to believe that dropping the unicode escaping is likely to be a typo-esque error; the exclusion of control characters seems deliberate but has the effect of disallowing tab characters (among others).

My testing seems to imply that Mozilla allows all control characters in a JSON string literal, including newlines, so I'd like clarification on what is actually allowed.

--Oliver
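The string rules quoted above from json.org can be sketched as a small checker. This is an illustration only: `isValidJsonStringBody` is a hypothetical name, and the regular expression is one reading of the two bullet points (with "control character" taken as U+0000 through U+001F), not text from either specification.

```javascript
// Body of a JSON string (the part between the quotes), per the rules quoted above:
//   - any character except ", \, or a control character (U+0000..U+001F)
//   - or one of the escapes \" \\ \/ \b \f \n \r \t \uXXXX
const JSON_STRING_BODY =
  /^(?:[^"\\\u0000-\u001f]|\\["\\\/bfnrt]|\\u[0-9a-fA-F]{4})*$/;

// Hypothetical helper: true when every character or escape in `body` is allowed.
function isValidJsonStringBody(body) {
  return JSON_STRING_BODY.test(body);
}

console.log(isValidJsonStringBody("tab\\there")); // escaped tab: true
console.log(isValidJsonStringBody("tab\there"));  // raw tab character: false
console.log(isValidJsonStringBody("\\u00e9"));    // \u escape: true
```

Under this reading a raw tab or newline inside a string is rejected, which is the consequence the message above points out.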
|
|
RE: JSON parser grammar

See inline

>-----Original Message-----
>From: es-discuss-bounces@... [mailto:es-discuss-bounces@...] On Behalf Of Oliver Hunt
>Sent: Tuesday, June 02, 2009 8:59 PM
>To: Mark S.Miller
>Cc: es-discuss@...
>Subject: Re: JSON parser grammar
>
>On Jun 2, 2009, at 7:26 PM, Mark S. Miller wrote:
>
>I'm not talking about the RFC, i'm talking about the ES5 spec. I
>guess it would be in the spirit of the RFC for the ES5 spec to define
>a JSON grammar that was more (or less) lax than the the RFC, but the
>ES5 spec itself should not allow variation between implementations
>that would be considered "valid" as historically any place in ES that
>has undefined "valid" behaviour has proved to be a compatibility
>problem later on.

The intent was for the ES5 JSON grammar to exactly match the JSON RFC grammar. If you think it is different, then you may have found a bug, so let's make sure...

The ES5 spec intentionally doesn't include the "A JSON parser MAY accept non-JSON forms or extensions." language from the RFC, but the general extension allowance given in section 16 is probably sufficient to allow a conforming ES5 implementation of JSON.parse to accept non-JSON forms or extensions. See more below...

>Currently I can make a string containing a JSON
>object that will produce different output (or not produce output at
>all) across multiple implementations that are all "correct" -- this
>seems like something that is just inviting disaster.

Examples, please? The intent is that applying JSON.parse to a string containing a valid JSON form should produce an equivalent set of objects on all conforming ES5 implementations.

>The json.org grammar allows the following set of characters in a string
> * Any unicode character except ", \, or a control character
> * \", \\, \/, \b, \f, \n, \r, \t, or \u four-hex-digits
>
>The ES5 spec is the same, only it defines "control character" as any
>character less than 0x20,

The JSON RFC also defines control character in this way: "All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F)."

>and drops escaped unicode.

No it doesn't (from the grammar in 15.12.1.1):

    JSONStringCharacter ::
        JSONSourceCharacter but not double-quote " or backslash \
        \ JSONEscapeSequence

    JSONEscapeSequence ::
        JSONEscapeCharacter
        UnicodeEscapeSequence    <------------

>I'm inclined to
>believe that dropping the unicode escaping is likely to be a typo-
>esque error, the exclusion of control characters seems deliberate but
>has the effect of disallowing tab characters (among others).

It identically matches the RFC.

>My
>testing seems to imply that mozilla allows all control characters in a
>JSON string literal including newlines, so i'd like clarification on
>what is actually allowed.

Step 2 of 15.12.1 (JSON.parse) seems pretty clear in this regard:

    2. Parse JText using the grammars in 15.12.1. Throw a SyntaxError
       exception if the JText did not conform to the JSON grammar for
       the goal symbol JSONValue.

A string containing control characters does not conform to JSONString, so a SyntaxError should be thrown. However, section 16 says: "all operations (...) that are allowed to throw SyntaxError are permitted to exhibit implementation-defined behaviour instead of throwing SyntaxError when they encounter an implementation-defined extension to the program syntax or regular expression pattern or flag syntax."

We can probably debate whether this extension allowance includes or should include JSON.parse. I probably could be convinced that it should not, but there seems to be a strong history of tolerance of almost correct inputs by JavaScript implementations, so I don't know whether or not we could get consensus on that.
|
|
Re: JSON parser grammar

On Wed, Jun 3, 2009 at 1:27 AM, Allen Wirfs-Brock <Allen.Wirfs-Brock@...> wrote:
>
> The intent was for the ES5 JSON grammar to exactly match the JSON RFC grammar. If you think it is different, then you may have found a bug so let's make sure...

It definitely doesn't match, on purpose. For example, the RFC requires JSON strings to represent objects (or arrays) at the root, no primitives allowed.

>
> Examples, please? The intent is that applying JSON.parse to a string containing a valid JSON form should produce an equivalent set of objects on all conforming ES5 implementation.

JSON.parse("[010]")

should be an error, per spec. Nobody follows the spec though...

--
Robert Sayre

"I would have written a shorter letter, but I did not have the time."
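A small sketch of the start-symbol difference described above. `isRfcRoot` is a hypothetical helper that layers the object-or-array root rule attributed to the RFC in this thread on top of `JSON.parse`; it assumes the host `JSON.parse` follows ES5 and accepts any JSON value at the top level.

```javascript
// With JSONValue as the goal symbol, primitives parse at the root.
const topLevelNumber = JSON.parse("42");
const topLevelString = JSON.parse('"text"');
const topLevelNull = JSON.parse("null");

// Hypothetical helper: the stricter root rule (object or array only).
function isRfcRoot(text) {
  const value = JSON.parse(text); // throws SyntaxError on malformed input
  return value !== null && typeof value === "object";
}

console.log(topLevelNumber, topLevelString, topLevelNull);
console.log(isRfcRoot("[1, 2]"), isRfcRoot('{"a": 1}'), isRfcRoot("42"));
```

The helper only inspects the parsed result, so it rejects nothing that `JSON.parse` itself would reject for other reasons.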
|
|
RE: JSON parser grammar

>-----Original Message-----
>From: Robert Sayre [mailto:sayrer@...]
>Sent: Tuesday, June 02, 2009 10:33 PM
>To: Allen Wirfs-Brock
>Cc: Oliver Hunt; Mark S.Miller; Rob Sayre; es-discuss@...
>Subject: Re: JSON parser grammar
>
>On Wed, Jun 3, 2009 at 1:27 AM, Allen Wirfs-Brock <Allen.Wirfs-Brock@...> wrote:
>>
>> The intent was for the ES5 JSON grammar to exactly match the JSON RFC
>> grammar. If you think it is different, then you may have found a bug so
>> let's make sure...
>
>It definitely doesn't match, on purpose. For example, the RFC requires
>JSON strings to represent objects (or arrays) at the root, no
>primitives allowed.

You're right, we did intentionally allow top level primitives.

>>
>> Examples, please? The intent is that applying JSON.parse to a string
>> containing a valid JSON form should produce an equivalent set of objects
>> on all conforming ES5 implementation.
>
>JSON.parse("[010]")
>
>should be an error, per spec. Nobody follows the spec though...

As I read them, neither the RFC nor the current ES5 JSON grammar recognizes "[010]" as a valid JSON form, so according to the ES5 spec a syntax error should be thrown. If we really want all implementations to accept "010" as a JSONNumber then we should specify it as such. Of course we have to define what it means (decimal, octal??).

My inclination would be to require ES5 implementations to conform exactly to whatever JSON grammar we provide and to throw syntax errors if the input doesn't exactly conform to the grammar (in other words, say that the section 16 extension allowance doesn't apply to JSON.parse). If an implementation wants to support JSON syntax extensions it could always do so by providing a JSON.parseExtended function (or whatever they want to call it) that uses an implementation defined grammar.

Allen
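To make the reading that "010" is not a JSONNumber concrete, here is a sketch of the number production as a regular expression. The pattern follows the json.org number syntax as commonly stated (an integer part that is either 0 or starts with 1-9, an optional fraction, an optional exponent); `isJsonNumber` is a hypothetical name and is not defined by ES5 or the RFC.

```javascript
// Integer part: "0" alone, or a digit 1-9 followed by more digits.
// Leading zeros such as "010" or "009" therefore do not match.
const JSON_NUMBER = /^-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?$/;

// Hypothetical helper: true when `text` is a number under that grammar.
function isJsonNumber(text) {
  return JSON_NUMBER.test(text);
}

for (const sample of ["10", "010", "0", "-0.5", "1e3", "0x10", ".5"]) {
  console.log(sample, isJsonNumber(sample));
}
```

Because the grammar never matches a leading zero, the question of whether "010" means 8 or 10 does not arise for a validating parser.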
|
|
Re: JSON parser grammar

On Tue, Jun 2, 2009 at 10:56 PM, Allen Wirfs-Brock <Allen.Wirfs-Brock@...> wrote:

+1.

-- Cheers, --MarkM
|
|
|
|
|
Re: JSON parser grammar

On Jun 3, 2009, at 11:12 AM, Oliver Hunt wrote:
>> 1.) leading zeros are parsed as decimal numbers (octal seems like a
>> bug no matter what, per MarkM)
> IE8 and V8's JSON implementation, and json2.js at json.org all
> interpret 010, as octal (eg. 8), and 009 as 9

Those look like bugs ;-).

The "noctal" (0377 is 255 but 0800 is 800) in JS since 1995 is surely the original bug, but we're stuck with it to some extent. We shouldn't spread it to JSON.

The ES specs mostly try to ignore noctal and hope it goes away, which can be a good strategy if there's a better mousetrap leading developers away from the attractive nuisance. But no one intentionally uses octal or noctal, AFAICT. Only perhaps by accident, and I know of no real-world mistakes of this kind (but I can believe they're out there still).

>> 2.) trailing commas in objects and arrays are allowed ({"foo":
>> 42,"bar":42,})
> V8's JSON implementation also accepts [1,,,2]

Good for it! :-)

>> 3.) tabs and linebreaks are allowed in JSON strings (but
>> JSON.stringify produces escape sequences, per spec)
> My testing shows that only '\' (excluding actual escape sequences)
> and '"' are prohibited -- all other values from 0-0xFFFF are allowed.

Seems like our bug.

/be
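The "noctal" rule mentioned above (0377 is 255 but 0800 is 800) can be emulated in a few lines. This is a sketch of the described behaviour for digit strings only; the function name `noctal` is hypothetical and the code makes no claim about how any engine implements its lexer.

```javascript
// Hypothetical emulation of the legacy "noctal" reading of a digit string.
function noctal(digits) {
  if (!/^[0-9]+$/.test(digits)) {
    throw new SyntaxError("expected decimal digits only: " + digits);
  }
  // A leading 0 followed only by digits 0-7 is read as base 8 ...
  if (/^0[0-7]+$/.test(digits)) {
    return parseInt(digits, 8);
  }
  // ... but an 8 or 9 anywhere makes the same shape fall back to base 10.
  return parseInt(digits, 10);
}

console.log(noctal("0377")); // 255
console.log(noctal("0800")); // 800
console.log(noctal("010"));  // 8
console.log(noctal("009"));  // 9
```

The last two lines reproduce the 010 and 009 results reported earlier in the thread, which is why the value of a leading-zero number depends on its later digits.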
|
|
|
|
|
RE: JSON parser grammar

See below

>-----Original Message-----
>From: Oliver Hunt [mailto:oliver@...]
...
>
>On Jun 2, 2009, at 11:09 PM, Rob Sayre wrote:
>
>> On 6/3/09 1:56 AM, Allen Wirfs-Brock wrote:
...
>> 1.) leading zeros are parsed as decimal numbers (octal seems like a
>> bug no matter what, per MarkM)
>IE8 and V8's JSON implementation, and json2.js at json.org all
>interpret 010, as octal (eg. 8), and 009 as 9

I'm not sure how you are testing IE8, but in my tests of IE8
    JSON.parse('010')
yields a syntax error (as currently specified by ES5), while
    JSON.parse('10')
returns the number 10.

json2.js is probably producing the results you see on IE because internally it uses eval, and IE supports octal literals with the semantics you observed. Are you sure you are actually running the native JSON when you seem to see octal being accepted? Native JSON is only enabled if your page is operating in "IE8 standards" mode.

>> 2.) trailing commas in objects and arrays are allowed ({"foo":
>> 42,"bar":42,})
>V8's JSON implementation also accepts [1,,,2]

IE8 throws syntax errors on both '[1,]' and '[1,,3]', as currently specified by ES5.

>> 3.) tabs and linebreaks are allowed in JSON strings (but
>> JSON.stringify produces escape sequences, per spec)
>My testing shows that only '\' (excluding actual escape sequences) and
>'"' are prohibited -- all other values from 0-0xFFFF are allowed.

IE8 allows all control characters except NUL, LF, and CR to appear unescaped in JSON string literals. This violates/extends the current ES5 spec. As far as I can tell, this is an unintended extension in IE8 that is a result of reusing some parts of the JavaScript lexer for JSON.

Allen
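The point above, that an eval-based parser inherits whatever the host's JavaScript lexer accepts, can be illustrated with a deliberately naive sketch. `evalStyleParse` and `nativeParseOutcome` are hypothetical names; this is not the json2.js source, it skips any screening of the input, and it must never be used on untrusted text.

```javascript
// Illustration only: hand the text to the host's JavaScript parser.
// Whatever number syntax the host lexer accepts, this "parser" accepts too.
function evalStyleParse(text) {
  // Indirect eval evaluates the text as non-strict global code.
  return (0, eval)("(" + text + ")");
}

// Report what the native, grammar-driven JSON.parse does with the same text.
function nativeParseOutcome(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (e) {
    return { ok: false, error: e.name };
  }
}

console.log(evalStyleParse("[10]")[0]);
console.log(nativeParseOutcome("[010]"));
```

Feeding "[010]" to `evalStyleParse` yields whatever the host's non-strict lexer makes of a leading zero, which is the effect described for json2.js on IE.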
|
|
Re: JSON parser grammar

The V8 implementation is a pretty early implementation and I would consider all of the issues raised here to be bugs in it. V8 actually just compiles the json as ordinary js:
http://www.google.com/codesearch/p?hl=en#W9JxUuHYyMg/trunk/src/json-delay.js&q=ParseJSONUnfiltered&l=30

I'm CCing Christian Plesner Hansen who wrote the JSON.parse method for V8 as well as v8-users.

On Wed, Jun 3, 2009 at 11:25, Oliver Hunt <oliver@...> wrote:
>
> On Jun 3, 2009, at 11:18 AM, Rob Sayre wrote:
>
>> On 6/3/09 2:12 PM, Oliver Hunt wrote:
>>>>
>>>> 1.) leading zeros are parsed as decimal numbers (octal seems like a bug
>>>> no matter what, per MarkM)
>>>
>>> IE8 and V8's JSON implementation, and json2.js at json.org all interpret
>>> 010, as octal (eg. 8), and 009 as 9
>>
>> Yes, I understand. Do you see why strict mode makes this behavior
>> undesirable?
>
> I'm not saying it makes the behaviour desirable, i'm commenting on the fact
> that any time implementations have been lax eventually all implementations
> become lax. I for one welcome our octal-free overlords ;)
>
>>>> 2.) trailing commas in objects and arrays are allowed
>>>> ({"foo":42,"bar":42,})
>>>
>>> V8's JSON implementation also accepts [1,,,2]
>>
>> What does it produce? An array with holes, or an array with null members?
>
> An array with holes -- in so far as i can tell V8's json object exactly
> matches the result of eval(string), just prohibiting arbitrary code
> execution.
>
>> - Rob
>
> --Oliver

-- erik
|
|
Re: RE: JSON parser grammar

Allen Wirfs-Brock wrote:
>> JSON.parse("[010]")
>>
>> should be an error, per spec. Nobody follows the spec though...
>
> As I read them neither the RFC or the current ES5 JSON grammar recognize "[010]" as a valid JSON form, so according to the ES5 spec. a syntax error should be thrown. If we really want all implementation to accept "010" as a JSONNumber then we should specify it as such. Of course we have to define what it means (decimal, octal??).
>
> My inclination would be to require ES5 implementation to exactly conform the whatever JSON grammar we provide and to throw syntax errors if the input doesn't exactly conform to the grammar. (in other say that the section 16 extension allowance doesn't apply to JSON.parse. If an implementation wants to support JSON syntax extensions it could always do so by providing a JSON.parseExtended function (or whatever they want to call it) that uses an implementation defined grammar.

I agree. It is not helpful to developers to allow weird forms on browser A but not on browser B. What should be allowed is clearly described in the ES5 spec.
|
|
|
|
|
Re: JSON parser grammar

> 2) Do we want to permit conforming implementations to extend the JSON
> grammar that they recognize?

No. An implementation has the license to support other formats (such as an XML object or a SuperJSON object). But the JSON object should recognize only the JSON forms described by ES5. There should be no Chapter 16 squishiness here.

> 3) If we disallow JSON grammar extensions (for JSON.parse) should we extend
> the existing grammar with some Postel's Law flexibility?
> a) Allow strings, numbers, Booleans, and null in addition to objects and
> arrays as top level JSON text.

Yes. This turns out to be very useful.

> b) Permit leading zeros on numbers either with or without octal implications.

No. Clearly we don't want octal. Allowing octal-like forms invites confusion.

> Does anyone know of any encoders or uses that actually insert leading 0's?

I do not know of any. If they did exist, they would be in violation of the JSON rules.

> c) Trailing commas in objects and arrays

This is a hazard for hand coding, just as with object literals. JSON was intended for machine-to-machine communication, so I prefer to not allow extra commas.

> d) Holes in arrays, eg [1,,3]

No holes.

> e) Allow some/all control characters to appear unescaped in JSON string
> literals. Which ones?

No. Keep it simple.

> Are there known encoders that pass through such characters without escaping them?

Not that I know of. Again, they would be in violation.

> f) Allow single quotes within JSON text as string delimiters

No.
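The answers to (a) through (f) above amount to a small conformance table. The sketch below encodes one sample input per item and checks the host's `JSON.parse` against it; the sample strings and the names `cases`, `accepts`, and `mismatches` are illustrative choices, not part of any specification or test suite.

```javascript
// Each entry: [JSON text, whether a strict ES5-style JSON.parse should accept it].
const cases = [
  ['"top-level string"', true],     // (a) primitives at the top level: yes
  ["010", false],                   // (b) leading zeros: no
  ['{"foo":42,"bar":42,}', false],  // (c) trailing commas: no
  ["[1,,3]", false],                // (d) holes in arrays: no
  ['"a\tb"', false],                // (e) raw control character (a tab): no
  ["'single'", false],              // (f) single-quoted strings: no
];

// True when the host parser accepts the text without throwing.
function accepts(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

const mismatches = cases.filter(([text, expected]) => accepts(text) !== expected);
console.log(mismatches.length === 0 ? "host matches the strict reading" : mismatches);
```

An empty `mismatches` list means the host agrees with the strict position on all six points; any remaining entries show exactly which extension it still accepts.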
|
|
Re: JSON parser grammar

The JSON RFC, by including the escape clause "A JSON parser MAY accept non-JSON forms or extensions", admits non-validating parsers. The table at <http://code.google.com/p/json-sans-eval/> gives us some good terminology. The reason we need JSON to be provided by platforms rather than libraries is that we desire JSON parsers that are simultaneously fast, secure, and validating.

Unfortunately, the RFC also specifies only (<object> | <array>) as a valid start symbol for parsing.

On Wed, Jun 3, 2009 at 12:59 PM, Douglas Crockford <douglas@...> wrote:
>> 2) Do we want to permit conforming implementations to extend the JSON
>> grammar that they recognize?
>
> No. An implementation has the license to support other formats (such as an
> XML object or a SuperJSON object). But the JSON object should recognize only
> the JSON forms described by ES5. There should be no Chapter 16 squishiness
> here.

Crock, is your position that ES5 should specify a validating JSON parse exactly equivalent to the parse specified in the RFC (i.e., waiving the escape clause), but with JSON <value> as the start symbol? If so, then I agree.

Are there *any* other differences between the RFC and ES5 besides the start symbol and the RFC's escape clause? If so, can we repair all of them?

Has anyone tested the annoying \u2028 and \u2029 issue on current implementations? I'd guess there are currently bugs here as well.

-- Cheers, --MarkM
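The message above does not spell out the \u2028 and \u2029 issue. A likely reading is that these two characters fall outside the U+0000 to U+001F control range, so JSON allows them unescaped inside strings, while ES5 treats them as line terminators that cannot appear raw in a JavaScript string literal. The sketch below assumes that reading; `escapeLineSeparators` is a hypothetical helper.

```javascript
// JSON text containing raw U+2028 and U+2029 inside a string.
// (The JavaScript source uses escapes so that this file itself stays portable.)
const raw = '"a\u2028b\u2029c"';
const parsed = JSON.parse(raw);

// Hypothetical helper: rewrite both characters as \uXXXX escapes so the JSON
// text can also be pasted into JavaScript source on engines that treat them
// as line terminators.
function escapeLineSeparators(jsonText) {
  return jsonText.replace(/[\u2028\u2029]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0"));
}

const escaped = escapeLineSeparators(raw);
console.log(parsed.length, escaped);
```

Both forms parse to the same string, so escaping on output costs nothing for JSON consumers and avoids the mismatch for eval-based ones.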
|
|
Re: JSON parser grammar

Mark S. Miller wrote:
> Crock, is your position that ES5 should specify a validating JSON
> parse exactly equivalent to the parse specified in the RFC (i.e.,
> waiving the escape clause), but with JSON <value> as the start symbol?
> If so, then I agree.

Yes.
|
|
Re: JSON parser grammar

Mark S. Miller wrote:
> Crock, is your position that ES5 should specify a validating JSON
> parse exactly equivalent to the parse specified in the RFC (i.e.,
> waiving the escape clause), but with JSON <value> as the start symbol?
> If so, then I agree.

Yes. Then we are in agreement.
|
|
Re: JSON parser grammar

On Wed, Jun 3, 2009 at 4:59 PM, Douglas Crockford <douglas@...> wrote:
> Mark S. Miller wrote:
>> Crock, is your position that ES5 should specify a validating JSON
>> parse exactly equivalent to the parse specified in the RFC (i.e.,
>> waiving the escape clause), but with JSON <value> as the start symbol?
>> If so, then I agree.
>
> Yes. Then we are in agreement.

OK, so, all such deviations will be considered bugs by implementations that purport to conform. Right?

--
Robert Sayre

"I would have written a shorter letter, but I did not have the time."
|
|
Re: JSON parser grammar

On Wed, Jun 3, 2009 at 2:10 PM, Robert Sayre <sayrer@...> wrote:
> OK, so, all such deviations will be considered bugs by implementations
> that purport to conform. Right?

Yes.

-- Cheers, --MarkM