.JSON JavaScript Object Notation
.json

JavaScript Object Notation

JSON (JavaScript Object Notation) stores structured data as UTF-8 text using six value types: strings, numbers, booleans, null, objects, and arrays. Independently standardized as both RFC 8259 (IETF) and ECMA-404 (Ecma International), it is the dominant data interchange format for web APIs, configuration files, and document databases.

File structure
Header schema
Records structured data
TextRFC 8259ECMA2001
Not convertible

Conversion not yet available. JSON to CSV or XML transformation requires schema mapping — a feature under consideration.

Common questions

What is a JSON file and what is it used for?

A JSON file stores structured data as human-readable text using key-value pairs and arrays. It is the standard format for web API responses, application configuration files like package.json and tsconfig.json, and NoSQL database documents. JSON is supported natively by every major programming language and can be opened in any text editor.

Can JSON have comments?

No. JSON does not support comments of any kind. Douglas Crockford deliberately excluded them from the specification to prevent their misuse as parsing directives, which had caused interoperability problems in earlier formats. For configuration files that need inline annotations, use JSONC (the format used by VS Code settings), JSON5, or YAML as alternatives.

What is the difference between JSON and XML?

JSON uses key-value pairs and arrays with minimal syntax. XML uses nested tags with opening and closing elements, attributes, and namespaces. JSON is 30-50 percent smaller for the same data and maps directly to native language data structures, which is why it replaced XML as the standard API format.

What is NDJSON?

NDJSON (Newline-Delimited JSON) places one complete JSON value per line, separated by newline characters. Unlike standard JSON which is a single document, NDJSON is a stream format used for log files, data pipelines, and APIs where the total size is unknown.

Why do large numbers lose precision in JSON?

Most JSON parsers use IEEE 754 double-precision floating point, which limits safe integer precision to 53 bits. Integers beyond 9,007,199,254,740,991 round silently, corrupting values without any error message. Twitter encountered this problem with tweet IDs and added string-typed fields as a workaround. Always transmit large numeric identifiers as strings.

How do I validate a JSON file?

Several free tools validate and format JSON. Desktop text editors with JSON support highlight syntax errors instantly. Online validators report the exact line and position of problems. Common mistakes that break JSON include trailing commas after the last item, single-quoted strings instead of double quotes, and unquoted property keys.

Is JSON a programming language?

No. JSON is a data interchange format, not a programming language. It has no variables, functions, loops, or logic. JSON describes data structure only. It was derived from JavaScript syntax but is completely language-independent and used by every major programming language.

What makes .JSON special

Discovered, not invented
Crockford found JSON hiding inside JavaScript
In 2001, Douglas Crockford realized a subset of JavaScript ES3 object literal syntax already worked as a language-independent data format. He 'discovered' JSON rather than designing it from scratch.
Dual governance
Two standards bodies publish the same spec
RFC 8259 (IETF) and ECMA-404 (Ecma International) are intentionally identical specifications published in 2017. This unique arrangement ensures no single organization can control JSON.
No-comments holy war
Crockford banned comments on purpose
Comments were deliberately excluded to prevent parsing directives from being embedded in data. This remains the most frequently asked question about JSON and spawned JSON5, JSONC, and HJSON.
XML killer
JSON ended the XML API era
JSON displaced XML as the standard API format because it is 30-50% smaller and maps directly to native data structures. Every major REST API and most GraphQL endpoints return JSON today.

Origin: Crockford's discovery

Few data formats can trace their origin to a single person's moment of recognition. Douglas Crockford, working at State Software in 2001, realized that a subset of JavaScript's object literal syntax already functioned as a minimal, language-independent data interchange format. He did not design JSON from scratch — he "discovered" it, as he later put it, hiding inside the ECMAScript 3 specification published in 1999. Crockford registered json.org in 2002 and posted the railroad diagrams that define the entire grammar on a single page. That grammar has not changed since.

Continue reading — full technical deep dive

Dual-specification governance

JSON holds a unique position in the standards world: two independent organizations publish intentionally identical specifications. The IETF maintains RFC 8259 (December 2017), which carries the weight of Internet Standard status (STD 90) and obsoletes the earlier RFC 7159 (2014) and RFC 4627 (2006). Ecma International maintains ECMA-404 (2nd Edition, 2017), and ISO published the same content as ISO/IEC 21778:2017. This arrangement was deliberate. By anchoring JSON in two separate standards bodies, no single organization can unilaterally change the format or impose licensing restrictions. The Library of Congress catalogs JSON as fdd000381, and the UK National Archives assign PRONOM identifier fmt/817.

The six types

JSON values are restricted to exactly six types. Strings must use double quotes — single quotes are a syntax error. Numbers are arbitrary-precision decimals in the grammar, but nearly every implementation parses them as IEEE 754 double-precision floats, which limits integer precision to 53 bits. Booleans are lowercase only: true and false, not True or False (a frequent Python-to-JSON gotcha). The null literal is also lowercase only. Objects are unordered collections of key-value pairs enclosed in curly braces, where every key must be a double-quoted string. Arrays are ordered sequences of values enclosed in square brackets. No other types exist — there is no date, no binary, no undefined, and no comment syntax.

The no-comments controversy

JSON does not support comments of any kind. Neither single-line nor block comments exist in the grammar. Crockford removed them deliberately to prevent their use as parsing directives — a practice that had caused interoperability problems in XML processing instructions. This decision remains the single most frequently asked question about JSON and has spawned an entire family of extensions: JSON5 adds comments along with trailing commas and unquoted keys; JSONC (JSON with Comments) is used internally by VS Code for settings files; HJSON provides a human-friendly superset with comments and multiline strings. None of these extensions are valid JSON per RFC 8259 or ECMA-404.

Number precision: the Twitter problem

The grammar allows numbers of arbitrary precision, but JSON.parse() in JavaScript and equivalent parsers in most other languages map numbers to IEEE 754 double-precision floats. This means integers larger than 2 to the power of 53 (9,007,199,254,740,992) lose precision silently. Twitter encountered this when tweet IDs exceeded the safe integer range — a tweet created with ID 9007199254740993 would round to 9007199254740992 after parsing. Twitter's solution was to add an id_str field containing the same ID as a string, a pattern now common across APIs that handle large identifiers. RFC 8259 Section 6 explicitly warns about this interoperability hazard.

JSON versus XML

JSON displaced XML as the dominant API data format because it is less verbose and maps directly to native data structures. The same data expressed in XML requires opening tags, closing tags, attributes, namespaces, and often a schema declaration — roughly 30 to 50 percent more bytes than the equivalent JSON. JSON objects map to dictionaries or hash maps in every major language without an intermediate deserialization step. XML's strengths — mixed content, attributes, namespaces, and XSLT transformations — remain relevant for document-centric use cases, but for structured data exchange between services, JSON has won decisively.

NDJSON: streaming JSON

Standard JSON is a single document — one root value per file. Newline-Delimited JSON (NDJSON) solves the streaming problem by placing one independent JSON value per line, separated by newline characters. Each line is a complete, parseable JSON document. This makes NDJSON suitable for log files, data pipelines, and streaming APIs where the total document size is unknown or unbounded. NDJSON is already covered as a separate format in FileDex. The key distinction: JSON is a document; NDJSON is a stream.

JSON Schema

JSON itself has no mechanism for declaring the expected structure of a document. JSON Schema (current draft: 2020-12) fills this gap by providing a vocabulary for specifying required fields, value types, string patterns, numeric ranges, array lengths, and cross-field dependencies. Validators like Ajv (JavaScript) and jsonschema (Python) check documents against schemas programmatically, enabling API request validation, configuration file checking, and form generation from schema definitions.

API ubiquity

Virtually every modern web API returns JSON. REST endpoints, GraphQL responses, webhook payloads, OAuth token exchanges, and server-sent events all default to JSON serialization. Configuration files across the JavaScript ecosystem — package.json, tsconfig.json, .eslintrc.json, vercel.json — are JSON. Document databases including MongoDB (via BSON), CouchDB, and Firebase Firestore store data as JSON documents. The format's ubiquity means that understanding JSON is a prerequisite for nearly every area of modern software development.

.JSON compared to alternatives

.JSON compared to alternative formats
Formats Criteria Winner
.JSON vs .XML
Verbosity
JSON represents the same data in roughly 30-50% fewer bytes than XML. No closing tags, no attribute syntax, no namespace declarations. Objects and arrays map directly to native language data structures.
JSON wins
.JSON vs .YAML
Parsing safety
JSON has exactly one valid parse for any input. YAML has implicit type coercion edge cases where 'NO' becomes boolean false and '1.0' becomes a float. YAML's whitespace sensitivity causes subtle bugs that JSON's explicit delimiters avoid.
JSON wins
.JSON vs .CSV
Nested data
JSON natively represents nested objects, arrays, and mixed types at arbitrary depth. CSV is strictly two-dimensional with no standard way to encode hierarchical structures or mixed-type columns.
JSON wins

Technical reference

MIME Type
application/json
Developer
Douglas Crockford / Ecma International
Year Introduced
2001
Open Standard
Yes — View specification

Binary Structure

JSON is a pure text format with no binary structure, no magic bytes, and no file header. Files are encoded in UTF-8 as mandated by RFC 8259 for interchange; earlier specifications permitted UTF-16 and UTF-32, but these are now prohibited. A JSON document consists of a single root value, typically an object (curly braces) or an array (square brackets), though RFC 8259 allows any value type at the root level. Objects contain comma-separated key-value pairs where keys are always double-quoted strings. Arrays contain comma-separated values. Whitespace between tokens — spaces, tabs, line feeds, and carriage returns — is insignificant and ignored by parsers. A BOM (Byte Order Mark) must not be added per RFC 8259 Section 8.1. No trailing commas, no single-quoted strings, no comments, and no undefined value are permitted. Number precision and maximum nesting depth are implementation-dependent.

1999ECMAScript 3 published — the JavaScript specification containing the object literal syntax that JSON would be derived from2001Douglas Crockford specifies JSON at State Software as a lightweight data interchange format based on a subset of ES32002Crockford registers json.org and publishes the railroad diagrams that define JSON grammar on a single page2006RFC 4627 published — first IETF specification for JSON as an Informational RFC2013ECMA-404 (1st Edition) published by Ecma International, establishing dual-specification governance2017RFC 8259 achieves Internet Standard status (STD 90), mandating UTF-8 and prohibiting BOM — simultaneously published with ECMA-404 2nd Edition and ISO/IEC 21778:20172018JSON5 specification published, adding comments, trailing commas, and unquoted keys to address developer frustrations with strict JSON2020JSON Schema draft 2020-12 released — validation, annotation, and hyper-schema for structured JSON documents
Pretty-print and validate JSON with jq other
jq '.' input.json

jq with the identity filter '.' reads, parses, and pretty-prints JSON with syntax highlighting. Exits with non-zero status on invalid JSON, making it a validator and formatter in one command.

Validate and format JSON with Python standard library other
python3 -m json.tool input.json

Python's built-in json.tool module parses and pretty-prints JSON with 4-space indentation. Exits with an error message on invalid input. No third-party packages required — available anywhere Python is installed.

Extract nested fields with jq other
jq '.data.users[] | {name: .name, email: .email}' input.json

Pipes each element of the .data.users array through a projection, outputting only name and email fields as new JSON objects. jq expressions compose like Unix pipelines for powerful data extraction.

Minify JSON to a single line other
jq -c '.' input.json > output.json

The -c (compact) flag outputs JSON on a single line with no extra whitespace. Reduces file size for deployment and API transmission without changing any data values.

Validate JSON inline with Node.js other
node -e "JSON.parse(require('fs').readFileSync('input.json','utf8')); console.log('Valid JSON')"

Uses Node.js built-in JSON.parse() to validate a file. Throws a SyntaxError with line and column position on invalid input, helping pinpoint the exact location of syntax errors.

Conversion not yet available. JSON to CSV or XML transformation requires schema mapping — a feature under consideration.

LOW

Attack Vectors

  • Prototype pollution — crafted JSON payloads containing __proto__ or constructor keys can modify object prototypes when parsed with unsafe merge or deep-copy functions, enabling property injection across the application
  • JSON injection — user input concatenated into JSON strings without proper escaping allows injection of additional key-value pairs or manipulation of existing values in server-side JSON construction
  • Insecure deserialization — parsers that instantiate objects based on type hints embedded in JSON data (common in Java and .NET frameworks) can trigger arbitrary code execution via gadget chains
  • Denial of service via deeply nested structures — JSON documents with 10,000+ nesting levels can cause stack overflow in recursive parsers, and extremely large string values can exhaust parser memory

Mitigation: FileDex processes JSON files entirely in the browser using native JSON.parse() — no eval(), no server upload, no deserialization into application objects. Parsing occurs within the browser's memory sandbox with built-in stack depth limits.

jq tool
Command-line JSON processor — filter, transform, and query JSON with pipe-style syntax
Vocabulary for validating JSON document structure, types, and constraints (draft 2020-12)
JSONLint service
Online JSON validator and formatter — paste or upload JSON for instant syntax checking
Ajv library
Fastest JSON Schema validator for JavaScript/TypeScript — compiles schemas to optimized code
orjson library
High-performance JSON serialization for Python — 3-10x faster than the standard library json module
simdjson library
SIMD-accelerated JSON parser that processes gigabytes per second using AVX2 and NEON instructions