Sorry for cross-posting, but I came across this message from Jim Melton to xml-dev this evening. It is a little off topic.
I’ve tried before to explain my reluctance to write standards as though they were implementation guides, peppered with examples and explanatory matter. But (not surprisingly) Jim has done a much better job than I have here. Enjoy.
---------- Forwarded message ----------
From: Jim Melton firstname.lastname@example.org
Date: 2 November 2016 at 21:45
Subject: Re: [xml-dev] Stick with XML … JSON is a minefield of security risks and ambiguities
To: “Costello, Roger L.” email@example.com, “firstname.lastname@example.org” email@example.com
Thanks for this message and for the link to your paper. I
have a few remarks to offer, if I may.
JSON is not inherently “bad”, nor is it inherently “good”.
Neither is XML, nor COBOL, nor SQL (my particular
standardization specialty), nor ice cream. Too many people seem
to grab hold of the latest notion and believe that it might
finally be the “silver bullet” that solves all of our problems,
simply and elegantly. Eventually, the hype wears off, and
intellectually honest people see the reality, which is that all
of those things are merely tools.
JSON has an important place in the information technology
ecosystem, which is acknowledged by the support that both XQuery
and SQL have given it.
However, like so many other tools, the “inventors” of JSON
documented their creation with text that is a kind of cross
between a specification and a user manual. As you well know,
the two kinds of documents serve vastly different purposes and
vastly different audiences. When I joined the SQL
standardization effort a third of a century ago, the standard in
development was very much a user manual with some attention paid
to edge cases. That was not due to laziness, but was a
deliberate choice in the interests of “simplicity”. And,
naturally, the result was widely – even wildly – different
implementations, a situation from which it was never possible to recover.
Ensuing editions of the standard have been very carefully
rewritten for precision and technical clarity, and we’re not
afraid to rewrite even complex sections when (not if!) lack of
precision or ambiguity is reported. The result has been
significantly more interoperability of features added in the
last three decades. Of course, the resulting standard is large,
complex, and most people would say unapproachable by application
programmers. But implementors love the fact that they don’t
(often!) have to wonder what the standard really means.
Tools: usually good.
Perfect tools: not available.
Universal solutions: will never exist.
Jim
On 2016-11-02 05:18, Costello, Roger L. wrote:
Excellent paper on JSON at last week’s
Soft-Shake Conference in Geneva. (http://seriot.ch/parsing_json.html)
Here are some extracts from the paper.
But first, a lesson learned:
Simple is good, but a simple, incomplete specification, such as the JSON
specification, leads to security flaws, interoperability problems,
crashes and denial of services.
Sometimes, simple specifications just mean hidden complexity.
Out of over 30 JSON parsers, no two parsers parsed the same set of
documents the same way.
JSON is not the easy, idealized format as many do believe.
Edge cases and maliciously crafted payloads can
cause bugs, crashes and denial of services, mainly because
JSON libraries rely on specifications that have evolved
over time and that left many details loosely specified or
not specified at all.
The conciseness of the grammar leaves many aspects undefined.
I [the author of the paper] wrote a corpus of JSON test files and
documented how selected JSON parsers chose to handle these
files … There were no two parsers that exhibited the same
behavior, which may cause serious interoperability issues.
JSON is not a data format you can rely on blindly. I’ve demonstrated
this by showing that the standard definition is spread out
over at least six different documents (section 1),
that the latest and most complete document, RFC-7159, is
imprecise and contradictory (section 2),
and by crafting test files that out of over 30 parsers, no
two parsers parsed the same set of documents the same way (section 4).
As a final word, I keep on wondering why “fragile” formats such
as HTML, CSS and JSON, or “dangerous” languages such as PHP
or JavaScript became so immensely popular. This is probably
because they are easy to start with by tweaking contents in
a text editor, because of too liberal parsers or
interpreters, and seemingly simple specifications. But
sometimes, simple specifications just mean hidden complexity.
--
========================================================================
Jim Melton --- Editor of ISO/IEC 9075-* (SQL)     Phone: +1.801.942.0144
  Chair, ISO/IEC JTC1/SC32 and W3C XML Query WG   Fax  : +1.801.942.3345
Oracle Corporation        Oracle Email: jim dot melton at oracle dot com
1930 Viscounti Drive      Alternate email: jim dot melton at acm dot org
Sandy, UT 84093-1063 USA  Personal email: SheltieJim at xmission dot com
========================================================================
=  Facts are facts.   But any opinions expressed are the opinions      =
=  only of myself and may or may not reflect the opinions of anybody   =
=  else with whom I may or may not have discussed the issues at hand.  =
========================================================================