[xml-dev] Stick with XML ... JSON is a minefield of security risks and ambiguities

Sorry for cross-posting, but I came across this message from Jim Melton to xml-dev this evening. It is a little off topic.

I’ve tried before to explain my reluctance to write standards as if they were implementation guides, peppered with examples and explanatory matter. But (not surprisingly) Jim has done a much better job than I have here. Enjoy.

---------- Forwarded message ----------
From: Jim Melton jim.melton@oracle.com
Date: 2 November 2016 at 21:45
Subject: Re: [xml-dev] Stick with XML … JSON is a minefield of security risks and ambiguities
To: “Costello, Roger L.” costello@mitre.org, xml-dev@lists.xml.org
Cc: jim.melton@oracle.com

Roger,

Thanks for this message and for the link to your paper. I
have a few remarks to offer, if I may.

JSON is not inherently “bad”, nor is it inherently “good”.
Neither is XML, nor COBOL, nor SQL (my particular
standardization specialty), nor ice cream. Too many people seem
to grab hold of the latest notion and believe that it might
finally be the “silver bullet” that solves all of our problems,
simply and elegantly. Eventually, the hype wears off and
intellectually honest people see the reality, which is that all
of those things are merely tools.

JSON has an important place in the information technology
ecosystem, which is acknowledged by the support that both XQuery
and SQL have given it.

However, like so many other tools, the “inventors” of JSON
documented their creation with text that is a kind of cross
between a specification and a user manual. As you well know,
the two kinds of documents serve vastly different purposes and
vastly different audiences. When I joined the SQL
standardization effort a third of a century ago, the standard in
development was very much a user manual with some attention paid
to edge cases. That was not due to laziness, but was a
deliberate choice in the interests of “simplicity”. And,
naturally, the result was widely – even wildly – different
implementations, a situation from which it was never possible to
recover.

Ensuing editions of the standard have been very carefully
rewritten for precision and technical clarity, and we’re not
afraid to rewrite even complex sections when (not if!) lack of
precision or ambiguity is reported. The result has been
significantly more interoperability of features added in the
last three decades. Of course, the resulting standard is large,
complex, and most people would say unapproachable by application
programmers. But implementors love the fact that they don’t
(often!) have to wonder what the standard really means.

Tools: usually good.

Perfect tools: not available.

Universal solutions: will never exist.

Thanks again,

       Jim

On 2016-11-02 05:18, Costello, Roger L. wrote:

Hi Folks,

Excellent paper on JSON at last week’s
Soft-Shake Conference in Geneva. (http://seriot.ch/parsing_json.html)

Below are some extracts from the paper.

But first, a lesson learned:

Simple is good but a simple, incomplete specification,
such as the JSON specification, leads to security flaws,
lack of interoperability, crashes and denial of services.

Sometimes simple specifications just mean hidden complexity.

Out of over 30 JSON parsers, no two parsers parsed the same set of
documents the same way.

  • JSON is not the easy, idealized format as many do believe.

  • Edge cases and maliciously crafted payloads can
    cause bugs, crashes and denial of services, mainly because
    JSON libraries rely on specifications that have evolved
    over time and that left many details loosely specified or
    not specified at all.

The conciseness of the grammar leaves many aspects undefined.

I [the author of the paper] wrote a corpus of JSON test files and
documented how selected JSON parsers chose to handle these
files … There were no two parsers that exhibited the same
behavior, which may cause serious interoperability issues.

JSON is not a data format you can rely on blindly. I’ve demonstrated
this by showing that the standard definition is spread out
over at least six different documents (section 1),
that the latest and most complete document, RFC-7159, is
imprecise and contradictory (section 2),
and by crafting test files that out of over 30 parsers, no
two parsers parsed the same set of documents the same way (section 4).

As a final word, I keep on wondering why “fragile” formats such
as HTML, CSS and JSON, or “dangerous” languages such as PHP
or JavaScript became so immensely popular. This is probably
because they are easy to start with by tweaking contents in
a text editor, because of too liberal parsers or
interpreters, and seemingly simple specifications. But
sometimes, simple specifications just mean hidden
complexity.
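
For readers who want to see one of these loosely specified corners for themselves, here is a minimal sketch using Python's standard json module. It is not taken from the paper; the choice of Python and the helper name strict_loads are assumptions made purely for illustration. RFC 7159 says only that names within an object "SHOULD be unique", and its grammar has no NaN literal, yet Python's parser by default keeps the last duplicate and accepts NaN. Other parsers may well choose differently.

```python
import json

def _reject_duplicates(pairs):
    """Build a dict from (key, value) pairs, failing on a repeated key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result

def _reject_constant(name):
    """json calls this hook for 'NaN', 'Infinity' and '-Infinity'."""
    raise ValueError(f"non-standard JSON constant: {name}")

def strict_loads(text):
    """Hypothetical helper: parse JSON but refuse two underspecified inputs."""
    return json.loads(
        text,
        object_pairs_hook=_reject_duplicates,
        parse_constant=_reject_constant,
    )

# Default behavior of Python's json module on the same two inputs:
lenient_dup = json.loads('{"a": 1, "a": 2}')  # the last value silently wins
lenient_nan = json.loads('[NaN]')             # accepted, though not in the grammar
```

Calling strict_loads on either of those two inputs raises ValueError instead, which is one way an application might narrow the gap between what the specification allows and what it is prepared to accept.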

-- ========================================================================
Jim Melton --- Editor of ISO/IEC 9075-* (SQL) Phone: +1.801.942.0144
  Chair, ISO/IEC JTC1/SC32 and W3C XML Query WG Fax : +1.801.942.3345
Oracle Corporation Oracle Email: jim dot melton at oracle dot com
1930 Viscounti Drive Alternate email: jim dot melton at acm dot org
Sandy, UT 84093-1063 USA Personal email: SheltieJim at xmission dot com
========================================================================
= Facts are facts. But any opinions expressed are the opinions =
= only of myself and may or may not reflect the opinions of anybody =
= else with whom I may or may not have discussed the issues at hand. =
========================================================================

Bob,

Good to hear from you, friend!

Thanks for sharing the message. There’s wisdom in Jim’s words (and I may have to steal a few of them for future use!)

Best,

Shaun

On Wed, Nov 2, 2016 at 6:42 PM, Bob Jolliffe bobjolliffe@gmail.com wrote:


Shaun J. Grannis, MD MS FACMI FAAFP
Biomedical Research Scientist, The Regenstrief Institute
Associate Professor, I.U. School of Medicine
410 West 10th Street, Suite 2000
Indianapolis, IN 46202
(317) 274-9092 (Office)
(317) 274-9305 (Fax)