Three Undocumented Features of JSON

Update: This article was written as an April Fool’s joke, taking a stab at the fact that JSON is a closed, dumb, and pretty anemic data format. There are other interesting data formats available, such as extensible data notation (edn).

JSON (JavaScript Object Notation) is widely known as JavaScript’s data format, often used as a language-independent data interchange format. However, what I consider to be the three most powerful features of JSON are little-known. The main reason for this is probably that they are pretty much undocumented, and therefore not widely used. This article will explain these features. It will show that JSON in itself, without any of the complicated hacks in JavaScript, is in fact a powerful, extensible and dynamic programming language, and not just a dumb, closed data format containing the least common denominator of all C-based programming languages. Since most developers have a limited view of what JSON is, I will call JSON with these three features JSON++.

Every developer knows that JSON has literal support for null, boolean, number, string, array and object. A JSON array is delimited with square brackets, [], and contains zero or more comma-separated elements:

A JSON object is known in other languages as a map, associative array, dictionary, or hash. It’s delimited by curly braces, {}, and contains zero or more comma-delimited key-value pairs, where the key and the value is delimited by a colon, :. Keys must be strings:

The List

What few developers know about JSON is that there is literal support for a third data structure, namely the list. It’s not documented on the official JSON page, and it’s not widely used, probably because most developers that do know about it, believe that arrays are sufficient for their needs. A list is delimited by parentheses, (), and contains zero or more comma-separated elements:

Unless quoted (more about that in the next section), a list is treated specially by the JavaScript interpreter. Its first element will be called as a function, with the remaining elements as its arguments:

This is the key feature that turns JSON into a powerful dynamic programming language with a uniform, simple prefix-based syntax. Of course, in order for this to work, there must be something that resolves the name square to a function. That’s the second undocumented feature, described in the following section.

The Symbol

Most developers have been trained that, in JSON, if a value is not a boolean, number or null, it should be a quoted string. This is not entirely true. There is another type of value: the symbol. Symbols begin with a non-numeric character and can contain alphanumeric characters and *, +, !, -, _, and ?. The interpreter will try to resolve any symbols it finds to their values, unless instructed to suppress evaluation.

Initially, there are almost no symbols defined, which may explain why most developers are unaware of them. Apart from the built-in operators such as +, *, etc, only a handful of special forms are pre-defined, but they really pack a punch:

  • quote
  • function
  • if
  • define

quote

The quote form simply suppresses the evaluation of its arguments. If indeed a list is needed, and not a function call, a list can be quoted:

The quote form needs to be special since it must suppress evaluation of everything.

function

The function form defines anonymous functions. It takes an optional name symbol, an argument list in the form of an array, and zero or more lists as the body of the function. Functions are first-class objects, making JSON a functional programming language. This list defines an anonymous function that squares its argument:

If a name symbol is provided, it will be bound to the function object itself, allowing for self-calling, even in anonymous functions. This list defines an anonymous factorial function:

The function form needs to be special, since it must suppress evaluation of the name symbol and the argument symbols.

if

The if form is the way to get conditional execution in JSON. It takes two or three arguments: a test expression, a then expression, and an optional else expression. Anything not false or null will be regarded as true. This list will return the string “it was true”, since true is, well, not false:

The if form must be special since it can’t evaluate all of its arguments. It needs to first evaluate the test expression, and, depending on the result, evaluate either the then expression or the else expression. If there is no else expression and the test expression evaluates to false (or null), null is returned.

define

The define special form is used to bind expressions to symbols. This list binds the number 3 to the symbol x:

This list defines the name square to refer to the anonymous square function:

The define form needs to be special, since the symbol it’s about to bind doesn’t yet exist. It must suppress evaluation of the symbol.

The Extension Tag

Most developers believe that JSON is a closed format, ie impossible to extend. One big problem that many experience with JSON is that there is no standard format for dates. Developers come up with their own date formats and encode them as strings. The receiving end must somehow know that a given field contains this special format, and be able to decode it properly. Even if everyone did agree on a standard date format, there is no way for the receiving end to automatically detect that a field is a date field. However, JSON already contains the solution: the undocumented extension tag, #.

The character # followed immediately by a symbol, indicates that the symbol is an extension tag. A tag indicates that the following element should be interpreted in a custom semantic way. When encountering a tag, the interpreter will pass the next element to the corresponding handler for further interpretation. The tag and the following element will be replaced with the result of the handler. The JSON reader implementation should allow clients to dynamically register custom tag handlers.

Here are a few examples of possible extension tags:

This allows for pure JSON code that has a standard format for dates, UUIDs, or whatever custom field needs to be transferred:

Note that the sender and the receiver can both have completely different implementations of an extension tag. For example, a JavaScript Date on the client side might be converted to #inst "2013-04-01T00:00:00Z", transmitted as JSON, and read by a Java JSON reader as a java.util.Date, or perhaps a JodaTime Date.

Built-in tags

The JSON interpreter currently has a few built-in tags:

#inst “rfc-3339-format”

An instant in time. The tagged element is a string in RFC-3339 format, which is identical to ISO-8601 (except for how an unknown local offset is denotated).

#uuid “6330ba21-bb10-4b9f-ad43-80ccdda5c646″

A universally unique identifier, UUID. The tagged element is a version 4 (random) UUID in canonical string representation, ie 32 hexadecimal digits, displayed in five groups separated by hyphens, in the form 8-4-4-4-12 for a total of 36 characters.

Future Improvements

Commas as Whitespace

One possible improvement in readability would be to treat commas outside of strings as whitespace. Commas could be added if they improve readability, for example between key-value pairs in objects. This feature would turn this somewhat tedious factorial implementation:

into this marvel of clarity:

This feature would break backwards compatibility for newly written JSON, but old code could still be read.

A Fourth Data Structure: Set

It would be beneficial, in my opinion, to provide not only the data structures Object, Array, and List, but also a Set. A Set is a collection of unique values. Sets can be represented by zero or more elements enclosed in curly braces preceded by #:

Of course, if the “commas as whitespace” feature is implemented, it would look like this:

Conclusion

I hope I have convinced you that JSON++, ie JSON together with the three undocumented features List, Symbol, and Extension Tag, provides for a powerful, dynamic, functional programming language, without any of the legacy of JavaScript.

Extension tags not only solves the ubiquitous problem of transferring dates in JSON, it also provides built-in support for UUIDs, and indeed any custom data format that can be represented using JSON literals.

Who knows, with the proposed improvements, ie “Commas as Whitespace” and “A Fourth Data Structure: Set”, JSON++ might even evolve into the data format of the future.

References

* JSON: http://json.org
* Extensible Data Notation (edn): http://edn-format.org

18 Comments

  1. Rajan

    Excellent article. Special Form in JavaScript seems like Clojure Forms. Very powerful construct.

  2. Which JSON parsers support these lists and LISP-like semantics?

  3. Peter Galiba

    Nice April Fool’s article.

  4. Thanks, excellent article

  5. wissem

    I dont’t get it .. is it an april foll really? .. thanks anyway…

  6. Pete A

    Oh dear… this has now been linked by Code Project… I foresee a whole set of problems caused by junior developers who don’t recognise the joke…

  7. I want a special JSON parameterized function called “DebugAll” or “FixMyCode”. lol, Cheers!

  8. brian m

    Not a unique April fools joke by any means.

    Microsoft did it last year – they released a preview called Windows 8 and people actually believed it was an operating system!

    • ChadF

      Yes.. but when will that Win8 joke be over? I’m still waiting.

      I guess the real joke is on Vista. Until now it was the poster child for failure (with its side kick WinME). “Sorry Vista, you’re not #1 at being the worst OS anymore!”

  9. Jokes aside… I like the idea of the extensions to specify data types… I would also like that JSON would remove the need for quotes in the key part of the object description of the key does not contains special characters or space.

    • Ulrik Sandberg

      Yes, the extension hash tag thing is brilliant. The cool thing is that you can have the extension tag, the set, non-string map keys, keywords, symbols, and more, for real, if you use edn:

      http://edn-format.org/

Trackbacks for this post

  1. Three Undocumented Features of JSON
  2. Three Undocumented Features of JSON | The code is art
  3. Phil 4.3.13 | wccdev

Leave a Reply