json_to_term EEP

Richard,

Thanks again for your work on the EEP.

I've been communicating with Damien (CouchDB lead) about the shape of objects as returned by json_to_term(). We think that returning a list of tuples is preferable to returning a tuple of tuples.

Starting with a JSON object like:

    {"key":"value", "key2":"value2"}

the two options in Erlang are:

Tuple of tuples (A):

    {{<<"key">>, <<"value">>},{<<"key2">>, <<"value2">>}}

or

Tuple containing a list of tuples (B):

    {[{<<"key">>, <<"value">>},{<<"key2">>, <<"value2">>}]}

We both have a preference for (B - list of tuples) because, based on current usage in CouchDB, (A - raw tuples) would have us calling tuple_to_list() constantly when we need to interact with the data. I don't see any big drawbacks to (B) and the ease-of-use argument is important. Requiring less code for the most common use-cases is a big win.

Chris

--
Chris Anderson
http://jchris.mfdz.com
_______________________________________________
erlang-questions mailing list
erlang-questions@...
http://www.erlang.org/mailman/listinfo/erlang-questions
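To make the ease-of-use argument concrete, here is a minimal sketch of a key lookup under shapes (A) and (B). The module and function names (json_shape_ab, lookup_a/2, lookup_b/2) are illustrative assumptions and are not part of the EEP; only lists:keyfind/3 and tuple_to_list/1 are standard.

```erlang
-module(json_shape_ab).
-export([lookup_a/2, lookup_b/2]).

%% Option (A): the object is a tuple of {Key, Value} tuples.
%% The list functions need a list, so the tuple is converted first.
lookup_a(Key, Object) when is_tuple(Object) ->
    case lists:keyfind(Key, 1, tuple_to_list(Object)) of
        {_, Value} -> {ok, Value};
        false -> error
    end.

%% Option (B): the object is {ListOfPairs}.
%% Pattern matching in the function head unwraps the list for free.
lookup_b(Key, {Pairs}) when is_list(Pairs) ->
    case lists:keyfind(Key, 1, Pairs) of
        {_, Value} -> {ok, Value};
        false -> error
    end.
```

Under (A) every access pays for tuple_to_list/1, whereas under (B) the list is available directly after matching on the one-element tuple.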
|
|
Re: json_to_term EEP

On Jul 28, 2008, at 2:42 PM, Chris Anderson wrote:
> [...]
> We both have a preference for (B - list of tuples) because based on
> current usage in CouchDB, (A - raw tuples) would have us calling
> tuple_to_list() constantly when we need to interact with the data. I
> don't see any big drawbacks to (B) and the ease-of-use argument is
> important. Requiring less code for the most common use-cases is a big
> win.

Wouldn't B also allow us to access the data as a proplist? If so, that seems like another reason to vote for B.

--Kevin
|
|
Re: json_to_term EEP

On Mon, Jul 28, 2008 at 11:42 AM, Chris Anderson <jchris@...> wrote:
> [...]
> We both have a preference for (B - list of tuples) because based on
> current usage in CouchDB, (A - raw tuples) would have us calling
> tuple_to_list() constantly when we need to interact with the data. I
> don't see any big drawbacks to (B) and the ease-of-use argument is
> important. Requiring less code for the most common use-cases is a big
> win.

I have a strong preference for B. I don't think I would use it if it was implemented as a tuple of tuples just because it wouldn't be very convenient.

-bob
|
|
Re: json_to_term EEP

Hi,

Chris Anderson wrote:
> the two options in Erlang are:
>
> Tuple of tuples (A): {{<<"key">>, <<"value">>},{<<"key2">>, <<"value2">>}}
>
> or
>
> Tuple containing a list of tuples (B): {[{<<"key">>,
> <<"value">>},{<<"key2">>, <<"value2">>}]}

I think there is no doubt that lists will be more useful than tuples. There is, however, another option, that I have been using in a json parser I wrote:

(C) an object is simply a proplist, i.e. a list of tuples.

This is what one really wants to have in erlang. The difference to option (B) is that while if a single object is decoded it is easy to discard the outer {}, when objects are used inside other structures that is not the case anymore, and (C) will result in a greater chance of allowing a decoded structure to be stored as is with no post-processing in a useful erlang structure.

The only problem (C) poses is distinguishing the empty object from an empty array. My solution (which I am almost happy about) is to represent the empty object as [{}]. This way:

- objects can be distinguished from arrays, e.g. by the following function:

    is_object(O=[T|_]) when is_tuple(T) -> true;
    is_object(_) -> false.

- we can use objects as proplists, use functions like lists:keysearch or list comprehensions like

    Keys = [V || {V,_} <- Object]

  which will work even for the special empty object [{}].

Anyway, the empty object is not a common case (at least for my purposes), and the advantages of being able to store nested objects in the most pleasant way is something that should make one consider option (C).

As others have said, I also do not consider option (A) useful.

Regards,
Paulo

P.S. Given the sudden interest in json, I will describe the options I took in my parser and make it available in a subsequent post, to further discussion.
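The nesting argument for (C) can be sketched as follows. This is an illustration only: the module name json_shape_c and the helpers keys/1, fetch/2 and get_path/2 are hypothetical names, not taken from Paulo's parser. It assumes objects are lists of {Key, Value} pairs and that the empty object is [{}].

```erlang
-module(json_shape_c).
-export([keys/1, fetch/2, get_path/2]).

%% Keys of an object; the {} inside [{}] does not match {K, _},
%% so the empty object yields [].
keys(Object) when is_list(Object) ->
    [K || {K, _} <- Object].

%% Single-level lookup; lists:keyfind/3 skips the {} element.
fetch(Key, Object) when is_list(Object) ->
    case lists:keyfind(Key, 1, Object) of
        {_, Value} -> {ok, Value};
        false -> error
    end.

%% Follow a list of keys through nested objects without unwrapping.
get_path([], Term) ->
    {ok, Term};
get_path([Key | Rest], Object) when is_list(Object) ->
    case lists:keyfind(Key, 1, Object) of
        {_, Value} -> get_path(Rest, Value);
        false -> error
    end;
get_path(_, _) ->
    error.
```

Because nested objects are themselves plain lists of pairs, get_path/2 needs no per-level unwrapping step, which is the convenience option (C) is after.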
|
|
Re: json_to_term EEP

Here's an example of (C):

From the JSON:

    {"key":"value", "key2":"value2"}

(C - object as proplist):

    [{<<"key">>,<<"value">>}, {<<"key2">>, <<"value2">>}]

On Mon, Jul 28, 2008 at 2:51 PM, Paulo Sérgio Almeida <psa@...> wrote:
>
> The only problem (C) poses is distinguishing the empty object from an empty
> array. My solution (which I am almost happy about) is to represent the empty
> object as [{}]. This way:
>
> - objects can be distinguished from arrays, e.g. by the following function:
>
> is_object(O=[T|_]) when is_tuple(T) -> true;
> is_object(_) -> false.
>
> - we can use objects as proplists, use functions like lists:keysearch or
> list comprehensions like
>
> Keys = [V || {V,_} <- Object]
>
> which will work even for the special empty object [{}].

(C) seems promising to me now that I've convinced myself that there aren't issues with nesting.

Thanks for the input, Paulo!

--
Chris Anderson
http://jchris.mfdz.com
|
|
Re: json_to_term EEP

On 29 Jul 2008, at 9:51 am, Paulo Sérgio Almeida wrote:
> I think there is no doubt that lists will be more useful than
> tuples. There is, however another option, that I have been using in
> a json parser I wrote:
>
> (C) an object is simply a proplist, i.e. a list of tuples.

This is in fact what I originally proposed, the tricky point being that {} is a legal empty object in JSON, and we can't map that to [] because that's the representation for the empty sequence [].

(O) Original proposal: {} => {}, other objects => list of pairs
(A) Armstrong version: object => tuple of pairs, no exceptions.
(B) Object => {list of pairs}.
(C) Almeida proposal: as (O) but {} => [{}].

The arguments for usability of the result in Erlang are the arguments that originally had me proposing (O). However, I note that nothing stops us providing a range of handy-dandy functions that work on tuples of pairs.

    %(O)
    is_object({}) -> true;
    is_object([{_,_}|_]) -> true;
    is_object(_) -> false.

    %(A)
    is_object(T) -> is_tuple(T).

    %(B)
    is_object({T}) -> is_list(T).

    %(C)
    is_object([T|_]) -> is_tuple(T);
    is_object(_) -> false.

It's rather annoying to be so bothered about empty objects; do they occur in practical JSON?

Proposal (C) seems neat enough; the main problem is fitting the results with @type.

--
If stupidity were a crime, who'd 'scape hanging?
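As a side-by-side check, the four predicates above can be placed in one module and run against the empty object and the empty array in each representation. This is a sketch; the module name json_is_object and the one-letter function names are assumptions made for illustration. The (B) predicate carries an extra catch-all clause so that non-objects return false instead of raising a function_clause error.

```erlang
-module(json_is_object).
-export([o/1, a/1, b/1, c/1]).

%% (O) {} is the empty object, other objects are lists of pairs.
o({}) -> true;
o([{_, _} | _]) -> true;
o(_) -> false.

%% (A) every object is a tuple of pairs.
a(T) -> is_tuple(T).

%% (B) every object is {ListOfPairs}; catch-all added for non-objects.
b({T}) -> is_list(T);
b(_) -> false.

%% (C) objects are lists of pairs, the empty object is [{}].
c([T | _]) -> is_tuple(T);
c(_) -> false.
```

In every scheme the empty array [] is rejected; what differs is only the term chosen for the empty object: {} for (O) and (A), {[]} for (B), and [{}] for (C).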
|
|
Re: json_to_term EEP

Hi all,

How about a SAX-like API? See for example http://www.p6r.com/articles/2008/05/22/a-sax-like-parser-for-json/. I can imagine that it would be easy to create any of the forms proposed in this thread based on such an API. On the other hand, it would allow you to do things that you wouldn't be able to do with a parser that produces a complete representation at once (in particular: parsing very big documents), and it would be better suited to support a 'data mapper' approach like the Erlang ASN.1 implementation, Google's Protocol Buffers or erlsom.

Regards,
Willem
|
|
Re: json_to_term EEP

On Tue, Jul 29, 2008 at 3:13 AM, Richard A. O'Keefe <ok@...> wrote:

    is_object({T}) -> is_list(T);
    is_object(_) -> false. % avoid exception

(C) seems good to me too, because proplists works fine with it.

    > proplists:get_bool(a, [{}]).
    false
    > proplists:get_bool(a, [{a, true}]).
    true
    > proplists:get_value(a, [{a, true}]).
    true
    > proplists:get_value(a, [{a, heh}]).
    heh
    > proplists:get_value(a, [{}]).
    undefined

Atoms are used here only for simplicity; it works with binaries too. (I assume, of course, that JSON's booleans should be the atoms true/false.)

--
--Hynek (Pichi) Vychodil
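The same calls with binary keys, as they would appear in decoded JSON, can be sketched like this. The module name json_proplist_demo is an assumption; the proplists functions used (get_value/2, get_value/3, is_defined/2) are standard.

```erlang
-module(json_proplist_demo).
-export([demo/0]).

%% Runs a few proplists lookups against a (C)-style object with
%% binary keys and against the empty object [{}].
demo() ->
    Empty = [{}],
    Obj = [{<<"a">>, true}, {<<"b">>, 2}],
    [proplists:get_value(<<"a">>, Obj),          % present key
     proplists:get_value(<<"b">>, Obj),          % present key
     proplists:get_value(<<"a">>, Empty),        % missing key, default undefined
     proplists:get_value(<<"a">>, Empty, none),  % missing key, explicit default
     proplists:is_defined(<<"b">>, Obj)].
```

The {} element of the empty object is simply skipped by these functions, so no special case is needed on the caller's side.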
|
|
Re: json_to_term EEP

You will have to forgive me, but I am now going to do something which I hate when others do it: comment without really knowing much about the topic. :-)

Why not just use option (B) and have the empty object as {[]}? It is always consistent and the empty object is easily distinguished from the empty list and empty string. I don't see that having the extra tuple should cause any problems, but then again I am no expert.

I would prefer to always have strings in *one* format and not special case keys with atoms sometimes. Otherwise to be certain you would have to match both atom and binary to find key. Unless you *always* use atoms for keys, which could easily explode.

Robert
|
|
Re: json_to_term EEP

On Tue, Jul 29, 2008 at 4:07 PM, Robert Virding <rvirding@...> wrote:

*I am no expert.*

You are joking.

So, on topic:

JSON:

    {"key":"value", "key2":{}, "key3":[{}, 3.14 , "val", true], "key4": {"a":false, "b":2} }

(B):

    {[
      {<<"key">>, <<"value">>},
      {<<"key2">>, {[]}},
      {<<"key3">>, [{[]}, 3.14, <<"val">>, true]},
      {<<"key4">>, {[{<<"a">>, false},{<<"b">>, 2}]}}
    ]}

(C):

    [
      {<<"key">>, <<"value">>},
      {<<"key2">>, [{}]},
      {<<"key3">>, [[{}], 3.14, <<"val">>, true]},
      {<<"key4">>, [{<<"a">>, false},{<<"b">>, 2}]}
    ]

(One can use it as a simple test case ;-) )

I don't know why the (B) version should be better than (C). It's true that (B) has minimal overhead and (C) has slightly (really only slightly) more complicated object detection, but in both variants objects and lists can be distinguished exactly, and in both this can be done in a function/case guard expression. Notice the key2, key3 and key4 values.

Result:
(B) - one more structure level for each object - no problem in Erlang
(C) - "more" type checking of the first element - no problem in Erlang

Technically it's fifty-fifty, and only personal preference decides. (One more structure level feels worse to me, but ...)

I argue unification, so transforming all to atom is insecure and result is don't use this way at all.

Aside non-uniformity of list_to_existing_atom way, there is performance drawback too. For each key you must call list_to_existing_atom(binary_to_list(X)) and binary_to_list causes GC pressure in this usage. I would not have use this variant, too. All is binary is best for me.

P.S.: Why non-uniform is problem. One can argue, it looks nicer. OK. One can argue, binary->atom transformation is done only for exists atoms and all atoms which used in comparisons are exists. BAD, imagine for example store Erlang term for long time or send to other nodes ... It *can* complicate things, so avoid it if you can and we *can*. I think, it is dangerous.

--
--Hynek (Pichi) Vychodil
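Since (B) and (C) carry the same information, a decoded term can be converted mechanically between the two. The following is a sketch under stated assumptions: strings are binaries (never Erlang lists), numbers and true/false are scalars, and the module and function names (json_convert, b_to_c/1, c_to_b/1) are hypothetical.

```erlang
-module(json_convert).
-export([b_to_c/1, c_to_b/1]).

%% (B) -> (C): drop the wrapping tuple; the empty object becomes [{}].
b_to_c({[]}) ->
    [{}];
b_to_c({Pairs}) when is_list(Pairs) ->
    [{K, b_to_c(V)} || {K, V} <- Pairs];
b_to_c(Array) when is_list(Array) ->
    [b_to_c(E) || E <- Array];
b_to_c(Scalar) ->
    Scalar.

%% (C) -> (B): a list whose first element is a pair is an object,
%% [{}] is the empty object, any other list is an array.
c_to_b([{}]) ->
    {[]};
c_to_b([{_, _} | _] = Pairs) ->
    {[{K, c_to_b(V)} || {K, V} <- Pairs]};
c_to_b(Array) when is_list(Array) ->
    [c_to_b(E) || E <- Array];
c_to_b(Scalar) ->
    Scalar.
```

Applied to the test case in the message above, b_to_c/1 turns the (B) term into the (C) term and c_to_b/1 reverses it, which supports the view that the choice is a matter of convenience rather than expressiveness.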
|
|
Re: json_to_term EEP

I find this discussion very interesting. Thanks to everyone who has spoken up.

2008/7/28 Willem de Jong <w.a.de.jong@...>:
> How about a SAX-like API? See for
> example http://www.p6r.com/articles/2008/05/22/a-sax-like-parser-for-json/

CouchDB will definitely need a streaming JSON processor if we are to handle giant documents without building them in memory. The example SAX/JSON parser in C++ is a good read; it's making me want to prototype something like that in Ruby. A SAX-like streaming tokenizer seems like it could lend itself to a nice, lean implementation.

On the question of formats, I think any of the proplist formats would be a good choice. Here's a look at is_array() for the proplist options.

    %(O)
    is_array([{_,_}|_]) -> false;
    is_array(T) -> is_list(T).

    %(B)
    is_array(T) -> is_list(T).

    %(C)
    is_array([T|_]) -> not is_tuple(T);
    is_array(T) -> is_list(T).

(B) has the simplest array/object test-functions and has the parsing/writing advantage that it doesn't require you to look inside each Erlang list to see if it corresponds to a JSON array or object. This means reading left-to-right you know immediately when you've encountered a JSON array or object.

I'm not sure how heavily to weight ease of reading (especially as some people could think of the {[]} format as harder to read due to the extra {}).

Chris

--
Chris Anderson
http://jchris.mfdz.com
|
|
Re: json_to_term EEP

On 29 Jul 2008, at 6:10 pm, Willem de Jong wrote:
> How about a SAX-like API?

(1) Anyone who wants such a design can produce their own design, AND their own code. The EEP I am concerned with is a DVM-like design (Document *Value* Model).

(2) In the XML world, there are several reasons for being interested in SAX-like designs (why the H*LL they could not bring themselves to say ESIS-like, when ESIS was the traditional SGML model for the event stream, I cannot imagine, unless it was sheer NIH).

  (A) You can start processing a document without waiting for the end. If people have JSON applications where they need to start, say, processing the properties of an "object" before knowing what other properties it may have, then such a design may be useful for them. See JSON-RPC note below.

  (B) You can process a HUGE document without having to hold all of it in memory. This was a major issue back in the days of 16-bit machines; one of the merits of Troff was that it produced pages "on-line", and pipelines involving SGML and Troff (or similar) made sense. These days, there are some amazingly large RDF files around, so again, not having to hold the whole thing makes sense. If people have JSON applications where they want to send 100s of MB of data as JSON, such a design may be useful for them. The 'man' documentation kit on Solaris works in very much this way: SGML documentation => events => hacky program that converts element edges to Troff macros => Troff.

  (C) You may be able to filter an event stream so as to yield the effect of selecting (or removing) elements. I've done more of this than I care to remember piping the output of nsgmls (or of the SWI Prolog SGML parser) through AWK scripts. Think "subset of XPath" and you'll get the idea. This is really a special case of (A) and (B). People who have a need for filtering lengthy JSON streams and want to reduce latency could use such a design.

(3) In the functional programming world, SAX is less attractive, because the usual techniques for using an ESIS/SAX-like interface are heavily stateful. Once I had my Document Value Model kit, I found doing things the "functional" way over documents as trees was so much easier than doing things the ESIS/SAX-like way that I now work with entire forms whenever I can, and this is *C* programming I'm talking about, where stateful is supposed to be easy.

(4) The JSON RFC makes it clear that JSON "messages", if I may call them that, may only be "arrays" or "objects"; a number or a string must be inside something else. In cases where an ESIS/SAX-like interface might have made sense, it would be more usual using JSON to send a stream of self-contained forms that can be easily processed one at a time as entire things.

(5) The JSON-RPC 1.1 draft (I haven't looked at 1.0) hints at some kind of ESIS/SAX-like interface when it says that arguments should be sent in such an order that the receiver can process them when it gets them. How are people actually using JSON-RPC? Is there that much to gain, in actual practice?

(6) Not on topic, but I can't help feeling that Linux D-Bus would be nicer if it used JSON...

> See for example http://www.p6r.com/articles/2008/05/22/a-sax-like-parser-for-json/
> . I can imagine that it would be easy to create any of the forms
> proposed in this thread based on such an API.

The thing is, it wouldn't be NEARLY as easy as NOT using such an API. Several Erlang JSON implementations have been mentioned or displayed in this thread already. They are not particularly hard to write. I'd say they are MUCH harder to design than to write! And the ones I have read would definitely have been *harder* to code using an ESIS/SAX-like interface.

> On the other hand it would allow you to do things that you wouldn't
> be able to do with a parser that produces a complete representation
> at once (in particular: parsing very big documents), and it would be
> better suitedt to support a 'data mapper' approach like the Erlang
> ASN.1 implementation, Googles Protocol Buffers or erlsom.

The question is whether the things that an ESIS/SAX-like interface let you do are things that people particularly *want* to do with JSON. I have no idea. The world has room for both "value" interfaces and "event stream" interfaces.

Obviously an ESIS-like interface is possible because we can trivially map JSON to XML:

    number => <number value="numeric string"/>
    string => <string value="string"/>
    array  => <array>e1...en</array>
    object => <object><slot name="n1">e1</slot>...</object>

So a JSON parser could simply emit the same event stream (using *precisely* a SAX interface) as an XML parser *would* have emitted given the equivalent XML. That is, you would not have a new *interface*, just a new *parser* that reused your existing "SAX" interface.
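To illustrate how the "value" and "event stream" views relate, here is a small sketch that flattens an already-decoded term into a list of events. It goes from value to events, which is the easy direction and not a streaming parser. The (B) shape is assumed for objects, and the event names (start_object, end_object, start_array, end_array, {key, K}, {value, V}) as well as the module name json_events are invented for this example; they are not taken from SAX, ESIS, or the EEP.

```erlang
-module(json_events).
-export([events/1]).

%% Flatten a decoded JSON term ((B) shape: objects are {Pairs})
%% into a list of events in document order.
events(Term) ->
    lists:reverse(walk(Term, [])).

%% The accumulator is built in reverse and flipped once at the end.
walk({Pairs}, Acc) when is_list(Pairs) ->
    Acc1 = lists:foldl(fun({K, V}, A) -> walk(V, [{key, K} | A]) end,
                       [start_object | Acc], Pairs),
    [end_object | Acc1];
walk(Array, Acc) when is_list(Array) ->
    Acc1 = lists:foldl(fun walk/2, [start_array | Acc], Array),
    [end_array | Acc1];
walk(Scalar, Acc) ->
    [{value, Scalar} | Acc].
```

A consumer of such a list has to carry its own stack to know where it is in the document, which is the statefulness referred to in point (3).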
|
|
Re: json_to_term EEP

It would be nice if people would read the EEP.

On 30 Jul 2008, at 2:55 am, Hynek Vychodil wrote:
> I would prefer to always have strings in *one* format and not
> special case keys with atoms sometimes. Otherwise to be certain you
> would have to match both atom and binary to find key. Unless you
> *always* use atoms for keys, which could easily explode.

In the EEP, json_to_term(IO_Data, Options) has an option

    {label,binary} or {label,atom} or {label,existing_atom}

There is no corresponding option for strings, which are always binaries. (The idea is that strings are unpredictable data, whereas labels are predictable structure.)

{label,binary} says to leave all labels as binaries. This would have been intolerable before <<"...">> syntax was introduced; now the main thing is that it wastes space.

{label,atom} says to convert to an atom any label that CAN be converted to an atom, the main limitation being that Erlang atoms are not yet Unicode-ready. (Someone else has an EEP about that, I believe.) This is perfect for communicating with a TRUSTED source, just like receiving Erlang term_to_binary() values and decoding them.

{label,existing_atom} means that a module that mentions certain atoms in pattern matches against formerly-JSON labels can be confident of finding those atoms, while other labels may remain binaries.

Options are a way of coping with different people's different situations and needs; the trick is to have just enough of them.

> I argue unification,

Unification of what with what?

> so transforming all to atom is insecure and result is don't use this
> way at all.

WITHIN a trust boundary, all is well. Not all communication crosses trust boundaries, otherwise term_to_binary() would be of little or no use.

>
> Aside non-uniformity of list_to_existing_atom way, there is
> performance drawback too. For each key you must call
> list_to_existing_atom(binary_to_list(X)) and binary_to_list causes
> GC pressure in this usage. I would not have use this variant, too.

What performance drawback? What call to binary_to_list()? Whoever said the binary EXISTED in the first place? The EEP is a proposal for putting these conversion functions in the Erlang core, eventually to be implemented in C. So implemented, the alleged performance drawback simply does not exist.

>
> P.S.: Why non-uniform is problem.

It is a problem for people who EXPECT a uniform translation, and not for people who don't.

> One can argue, it looks nicer. OK. One can argue, binary->atom
> transformation is done only for exists atoms and all atoms which
> used in comparisons are exists. BAD, imagine for example store
> Erlang term for long time or send to other nodes

Again, you are overlooking the fact that different people have different needs, and that the translation of labels can be (and IS, in the EEP) an OPTION.

You are also overlooking the fact that *considered as JSON*, the forms are entirely equivalent, and that since JSON explicitly says that the order of key:value pairs does not matter, there is uncertainty about precisely what Erlang term you get anyway.

In fact, for binary storage, conversion to existing atoms is *better* than conversion to binaries, because the Erlang term-to-binary format uses a compression scheme for atoms that it does not use for binaries. Admittedly, the answer to that is to extend the compression scheme to binaries as well.
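The three label policies can be approximated in plain Erlang for experimentation. This is only a sketch of the described behaviour, not the EEP's implementation: the module name json_label and the function convert/2 are hypothetical, and it relies on the binary_to_atom/2 and binary_to_existing_atom/2 BIFs found in current OTP releases.

```erlang
-module(json_label).
-export([convert/2]).

%% {label, binary}: leave the label untouched.
convert(Label, binary) when is_binary(Label) ->
    Label;
%% {label, atom}: convert when possible, otherwise keep the binary.
%% Only appropriate for trusted input, since atoms are never reclaimed.
convert(Label, atom) when is_binary(Label) ->
    try binary_to_atom(Label, utf8)
    catch error:_ -> Label
    end;
%% {label, existing_atom}: convert only labels whose atom already
%% exists in the running node; all others stay binaries.
convert(Label, existing_atom) when is_binary(Label) ->
    try binary_to_existing_atom(Label, utf8)
    catch error:badarg -> Label
    end.
```

With existing_atom the result depends on which atoms the node has already loaded, so two nodes can legitimately decode the same label differently; that is the behaviour debated in the following messages.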
|
|
Re: json_to_term EEP

In message <5B5B0B86-D10A-4B88-AA77-BD13381E1393@...> Richard O'Keefe writes:
>On 29 Jul 2008, at 6:10 pm, Willem de Jong wrote:
>> How about a SAX-like API?
>
>(1) Anyone who wants such a design can produce their own design,
> AND their own code. The EEP I am concerned with is a DVM-
> like design (Document *Value* Model).

Note: if anyone dislikes DVM because of difficulties in editing large values - have a look at "zippers".

>(5) The JSON-RPC 1.1 draft (I haven't looked at 1.0) hints at some
> kind of ESIS/SAX-like interface when it says that arguments
> should be sent in such an order that the receiver can process
> them when it gets them. How are people actually using JSON-RPC?
> Is there that much to gain, in actual practice?

I've used only JSON-RPC 1.0, which (as gratuitous exposition) was essentially just:

requests are JSON objects with the fields:
- id: (term) a value to associate request and response
- method: (string) the name of the procedure being called
- args: (array) the arguments

responses are JSON objects with the fields:
- id: (term) to associate the response with the request
- result: (term) the result of the procedure application
- error: (term) an exceptional result
- exactly one of result or error will be null

JSON-RPC could be layered directly over TCP, or any other bytestream transport. This means that the JSON parser is required to do proper framing - to be able to handle too much or too little input. This motivated my request for a continuation-based parser interface in my feedback to the original EEP draft.

The direct layering of JSON-RPC over a stream transport allowed for out-of-order responses over a single connection. For reasonably-sized requests and responses, this was almost as good as having channels within the connection, as BEEP has.

Sadly, JSON-RPC 1.1 looks like it is only layered on top of HTTP, losing this feature.

In answer to your question, I've used JSON-RPC (1.0) for a production service, and I've been just fine with a value model for the parsed results. I kept the size of JSON terms small by design: if the parsed terms were too big to conveniently handle as Erlang values, they would have been clogging the transport too much.

Jim
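As an illustration of working with such messages under a value model, the sketch below builds a request and classifies a decoded response. It follows the field list in the message above and makes several assumptions that are not from the thread: objects use the (B) shape, JSON null decodes to the atom null, and the names jsonrpc10_sketch, request/3 and classify/1 are invented.

```erlang
-module(jsonrpc10_sketch).
-export([request/3, classify/1]).

%% Build a request object from an id, a method name and an argument list.
request(Id, Method, Args) when is_binary(Method), is_list(Args) ->
    {[{<<"id">>, Id}, {<<"method">>, Method}, {<<"args">>, Args}]}.

%% Classify a decoded response, enforcing the rule that
%% exactly one of result and error is null.
classify({Pairs}) when is_list(Pairs) ->
    Id = proplists:get_value(<<"id">>, Pairs, null),
    Result = proplists:get_value(<<"result">>, Pairs, null),
    Error = proplists:get_value(<<"error">>, Pairs, null),
    case {Result, Error} of
        {null, null} -> {invalid, Id};        % neither field is set
        {_, null}    -> {ok, Id, Result};
        {null, _}    -> {error, Id, Error};
        {_, _}       -> {invalid, Id}         % both fields are set
    end.
```

Matching responses to requests by id is what allows out-of-order replies on a single connection; the caller would keep a table of outstanding ids keyed on the first element returned here.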
|
|
Re: json_to_term EEP

In message <38C632F4-991C-4F8D-8694-8DE1066385FC@...> Richard A. O'Keefe writes:
>On 30 Jul 2008, at 2:55 am, Hynek Vychodil wrote:
>> Aside non-uniformity of list_to_existing_atom way, there is
>> performance drawback too. For each key you must call
>> list_to_existing_atom(binary_to_list(X)) and binary_to_list causes
>> GC pressure in this usage. I would not have use this variant, too.
>
>What performance drawback? What call to binary_to_list()? Whoever said
>the binary EXISTED in the first place? The EEP is a proposal for
>putting
>these conversion functions in the Erlang core, eventually to be
>implemented in C. So implemented, the alleged performance drawback
>simply
>does not exist.

I may have been the source of the confusion here. I mentioned list_to_existing_atom/1 in my feedback to Richard's original draft. I mentioned it only to a) point to existing semantics, and b) suggest that the proposed parser interface would allow a pure Erlang implementation in addition to being built in to the runtime, though I was not explicit about either reason.

Jim
|
|
Re: json_to_term EEP

On Wed, 30 Jul 2008 12:55:27 am Hynek Vychodil wrote:
> JSON: {"key":"value", "key2":{}, "key3":[{}, 3.14 , "val", true], "key4":
> {"a":false, "b":2} }
>
> (B): {[
> {<<"key">>, <<"value">>},
> {<<"key2">>, {[]}},
> {<<"key3">>, [{[]}, 3.14, <<"val">>, true]},
> {<<"key4">>, {[{<<"a">>, false},{<<"b">>, 2}]}}
> ]}

How about

    {json, [ {...} ] }

so that we know what we are looking at and can check it in function argument patterns etc.

--
Anthony Shipman                 Mamas don't let your babies
als@...                         grow up to be outsourced.
|
|
Re: json_to_term EEP

On Wed, Jul 30, 2008 at 3:34 AM, Richard A. O'Keefe <ok@...> wrote:
> It would be nice if people would read the EEP.

All JSON data coming outside Erlang are binary in first state, there is no Erlang lists outside Erlang.

You are overlooking the fact, that there are another scenarios. For example:

1/ Read and parse JSON {"a":1, "b":2, "c":3} on one erlang node with one set of existing atoms (a,b).

2/ Store Erlang term to file [{a,1}, {b,2}, {<<"c">>, 3}]

3/ In another erlang node with existing atom list {a,c} (for example in some module you want detect c key of data take from JSON) you load and parse same JSON {"a":1, "b":2, "c":3} and from parser you get [{a,1}, {<<"b">>,2}, {c, 3}]

4/ Then you load stored erlang term from file and two things happened. You take [{a,1}, {b,2}, {<<"c">>, 3}] and existing atoms are now {a,b,c}.

5/ Read and parse JSON {"a":1, "b":2, "c":3} again and you take [{a,1}, {b,2}, {c, 3}]

6/ Great, you have terms [{a,1}, {b,2}, {c, 3}], [{a,1}, {b,2}, {<<"c">>, 3}] and [{a,1}, {<<"b">>,2}, {c, 3}] as Erlang term representing same JSON input {"a":1, "b":2, "c":3}. What the hell, there is some totally wrong, isn't it?

Erlang is way how to make things safe and reliable. Converting keys to atoms is not safe and reliable so don't do it, It hurts you!

--
--Hynek (Pichi) Vychodil
|
|
Re: json_to_term EEP

Anthony Shipman wrote:
> How about
> {json, [ {...} ] }
> so that we know what we are looking at and can check it in function argument
> patterns etc.

rfc4627.erl uses {obj, [{Key, Value}, ...]}.

Personally, I'm in favour of the uniform option {[{Key, Value}, ...]}, with the empty object being {[]}. It permits uniform treatment of the list of key-value pairs without a gratuitous special case. I find myself reading it as if JSON objects are delimited by a new kind of brackets, "{[" and "]}".

Tony
|
|
Re: json_to_term EEP

On 30 Jul 2008, at 10:07 pm, Hynek Vychodil wrote:

[it was rather hard to figure out what was just quoting and what was actual response]

> What performance drawback? What call to binary_to_list()? Whoever
> said
> the binary EXISTED in the first place? The EEP is a proposal for
> putting
> these conversion functions in the Erlang core, eventually to be
> implemented in C. So implemented, the alleged performance drawback
> simply
> does not exist.
>
> All JSON data coming outside Erlang are binary in first state,
> there is no Erlang lists outside Erlang.

True and irrelevant: the ONLY lists that json_to_term/[1,2] should construct are the ones in the results. NO list construction whatsoever is implied in the handling of strings. Remember, this is an EEP based on Joe Armstrong's suggestion that there should be new built in functions!

> You are overlooking the fact, that there are another scenarios.

ABSOLUTELY NOT! Remember, options are OPTIONS.

> For example:
>
> 1/ Read and parse JSON {"a":1, "b":2, "c":3} on one erlang node with
> one set of existing atoms (a,b).
>
> 2/ Store Erlang term to file [{a,1}, {b,2}, {<<"c">>, 3}]

Remember, this does *NOT* happen by default. For labels to be converted to existing atoms, the programmer HAS TO ASK FOR IT EXPLICITLY. You are 100% right that the DEFAULT options should be safe.

However, the real danger here has nothing to do with atoms. The danger is this: if you want to store JSON data, you should store it *as* JSON, not as something else. (I am counting compressed JSON as JSON here.) The EEP points out other ways in which Erlang-encoded-JSON may vary: numbers might be integers or floats, {key,value} pairs may be reordered in many ways.

Nor does this have anything to do with Erlang specifically. For ALL languages, if you want to store JSON or transmit it or in any way cause JSON data known to one node to become known to another node you should store or transmit it *AS* (possibly compressed) JSON, not as something else. Anyone who keeps this straight will not run into trouble.

>
> 3/ In another erlang node with existing atom list {a,c} (for example
> in some module you want detect c key of data take from JSON) you
> load and parse same JSON {"a":1, "b":2, "c":3} and from parser you
> get [{a,1}, {<<"b">>,2}, {c, 3}]

Remember, {label,existing_atom} is meant for a module that wants to receive a JSON term and process it, looking for keys that are mentioned in that module. If an Erlang process holds a JSON term in Erlang form and wants to pass it to another node or another time, it should send it AS JSON.

> 4/ Then you load stored erlang term from file and two things happened.
> You take [{a,1}, {b,2}, {<<"c">>, 3}] and existing atoms are now
> {a,b,c}.
>
> 5/ Read and parse JSON {"a":1, "b":2, "c":3} again and you take [{a,
> 1}, {b,2}, {c, 3}]
>
> 6/ Great, you have terms [{a,1}, {b,2}, {c, 3}], [{a,1}, {b,2},
> {<<"c">>, 3}] and [{a,1}, {<<"b">>,2}, {c, 3}] as Erlang term
> representing same JSON input {"a":1, "b":2, "c":3}. What the hell,
> there is some totally wrong, isn't it?

Yes, and what is wrong is seriously incompetent programming. There are other round trip issues, including the handling of numbers, and the order of {key,value} pairs.

Recall that the default is {label,binary}. So in one node we convert a JSON form to an Erlang term. Another node does the same. One of the nodes then sends its term to the other, which compares the two terms. Are they guaranteed to be the same? Nope.

[hint: FOR ANY PROGRAMMING LANGUAGE AND LIBRARY THE ONLY COMPLETELY RELIABLE WAY TO TRANSMIT JSON DATA IS *AS* *JSON*. Got that?]

{label,existing_atom} simply is not meant for the use case you present.

>
>
> Erlang is way how to make things safe and reliable. Converting keys
> to atoms is not safe and reliable so don't do it, It hurts you!

No, it only hurts stupid people. Converting keys to existing atoms is perfectly safe for SOME uses, and there seems to be no good reason to forbid letting people do that when they are willing to take responsibility for it being safe. Expecting JSON forms to convert to identical Erlang terms at all times and in all places, now THAT is not safe and not reliable and WILL hurt you. I could make similar remarks about any language, and about many formats including XML.
|
|
Re: json_to_term EEP

On 7/30/08, Richard A. O'Keefe <ok@...> wrote:
> On 29 Jul 2008, at 6:10 pm, Willem de Jong wrote:

Of course, but if the Erlang team creates special, fast support in C it would be good if it could be used by as many people as possible.

> (3) In the functional programming world, SAX is less attractive,

I personally like working with a SAX parser. See the example below - I quite enjoyed writing it.

> The question is whether the things that an ESIS/SAX-like interface

The point is that the Erlang team would probably like to implement only 1 very fast JSON parser in C. In my opinion, that should be a SAX-like parser, because it is easy to create DVM output based on SAX output, but pointless to do it the other way around.

To give an example: A sax parser may create the following events (that is: call its callback

    E = [startDocument,startObject, {key,"menu"}, startObject, {key,"id"},

(This corresponds to a slightly shortened version of the second example found on json.org).

Below is an example of a callback function to process these events - this function would be called by the SAX parser when it has processed another relevant part of the JSON document. The parser passes the value

    dvm(startDocument, _) ->

With the events given above this gives the following output:

    {[{"menu",

Regards,
Willem