JavaScript Object Notation
JSON (JavaScript Object Notation) stores structured data as UTF-8 text using six value types: strings, numbers, booleans, null, objects, and arrays. Independently standardized as both RFC 8259 (IETF) and ECMA-404 (Ecma International), it is the dominant data interchange format for web APIs, configuration files, and document databases.
Conversion not yet available. JSON to CSV or XML transformation requires schema mapping — a feature under consideration.
Common questions
What is a JSON file and what is it used for?
A JSON file stores structured data as human-readable text using key-value pairs and arrays. It is the standard format for web API responses, application configuration files like package.json and tsconfig.json, and NoSQL database documents. JSON is supported natively by every major programming language and can be opened in any text editor.
Can JSON have comments?
No. JSON does not support comments of any kind. Douglas Crockford deliberately excluded them from the specification to prevent their misuse as parsing directives, which had caused interoperability problems in earlier formats. For configuration files that need inline annotations, use JSONC (the format used by VS Code settings), JSON5, or YAML as alternatives.
What is the difference between JSON and XML?
JSON uses key-value pairs and arrays with minimal syntax. XML uses nested tags with opening and closing elements, attributes, and namespaces. JSON is 30-50 percent smaller for the same data and maps directly to native language data structures, which is why it replaced XML as the standard API format.
What is NDJSON?
NDJSON (Newline-Delimited JSON) places one complete JSON value per line, separated by newline characters. Unlike standard JSON which is a single document, NDJSON is a stream format used for log files, data pipelines, and APIs where the total size is unknown.
Why do large numbers lose precision in JSON?
Most JSON parsers use IEEE 754 double-precision floating point, which limits safe integer precision to 53 bits. Integers beyond 9,007,199,254,740,991 round silently, corrupting values without any error message. Twitter encountered this problem with tweet IDs and added string-typed fields as a workaround. Always transmit large numeric identifiers as strings.
How do I validate a JSON file?
Several free tools validate and format JSON. Desktop text editors with JSON support highlight syntax errors instantly. Online validators report the exact line and position of problems. Common mistakes that break JSON include trailing commas after the last item, single-quoted strings instead of double quotes, and unquoted property keys.
Is JSON a programming language?
No. JSON is a data interchange format, not a programming language. It has no variables, functions, loops, or logic. JSON describes data structure only. It was derived from JavaScript syntax but is completely language-independent and used by every major programming language.
What makes .JSON special
Origin: Crockford's discovery
Few data formats can trace their origin to a single person's moment of recognition. Douglas Crockford, working at State Software in 2001, realized that a subset of JavaScript's object literal syntax already functioned as a minimal, language-independent data interchange format. He did not design JSON from scratch — he "discovered" it, as he later put it, hiding inside the ECMAScript 3 specification published in 1999. Crockford registered json.org in 2002 and posted the railroad diagrams that define the entire grammar on a single page. That grammar has not changed since.
Continue reading — full technical deep dive
Dual-specification governance
JSON holds a unique position in the standards world: two independent organizations publish intentionally identical specifications. The IETF maintains RFC 8259 (December 2017), which carries the weight of Internet Standard status (STD 90) and obsoletes the earlier RFC 7159 (2014) and RFC 4627 (2006). Ecma International maintains ECMA-404 (2nd Edition, 2017), and ISO published the same content as ISO/IEC 21778:2017. This arrangement was deliberate. By anchoring JSON in two separate standards bodies, no single organization can unilaterally change the format or impose licensing restrictions. The Library of Congress catalogs JSON as fdd000381, and the UK National Archives assign PRONOM identifier fmt/817.
The six types
JSON values are restricted to exactly six types. Strings must use double quotes — single quotes are a syntax error. Numbers are arbitrary-precision decimals in the grammar, but nearly every implementation parses them as IEEE 754 double-precision floats, which limits integer precision to 53 bits. Booleans are lowercase only: true and false, not True or False (a frequent Python-to-JSON gotcha). The null literal is also lowercase only. Objects are unordered collections of key-value pairs enclosed in curly braces, where every key must be a double-quoted string. Arrays are ordered sequences of values enclosed in square brackets. No other types exist — there is no date, no binary, no undefined, and no comment syntax.
The no-comments controversy
JSON does not support comments of any kind. Neither single-line nor block comments exist in the grammar. Crockford removed them deliberately to prevent their use as parsing directives — a practice that had caused interoperability problems in XML processing instructions. This decision remains the single most frequently asked question about JSON and has spawned an entire family of extensions: JSON5 adds comments along with trailing commas and unquoted keys; JSONC (JSON with Comments) is used internally by VS Code for settings files; HJSON provides a human-friendly superset with comments and multiline strings. None of these extensions are valid JSON per RFC 8259 or ECMA-404.
Number precision: the Twitter problem
The grammar allows numbers of arbitrary precision, but JSON.parse() in JavaScript and equivalent parsers in most other languages map numbers to IEEE 754 double-precision floats. This means integers larger than 2 to the power of 53 (9,007,199,254,740,992) lose precision silently. Twitter encountered this when tweet IDs exceeded the safe integer range — a tweet created with ID 9007199254740993 would round to 9007199254740992 after parsing. Twitter's solution was to add an id_str field containing the same ID as a string, a pattern now common across APIs that handle large identifiers. RFC 8259 Section 6 explicitly warns about this interoperability hazard.
JSON versus XML
JSON displaced XML as the dominant API data format because it is less verbose and maps directly to native data structures. The same data expressed in XML requires opening tags, closing tags, attributes, namespaces, and often a schema declaration — roughly 30 to 50 percent more bytes than the equivalent JSON. JSON objects map to dictionaries or hash maps in every major language without an intermediate deserialization step. XML's strengths — mixed content, attributes, namespaces, and XSLT transformations — remain relevant for document-centric use cases, but for structured data exchange between services, JSON has won decisively.
NDJSON: streaming JSON
Standard JSON is a single document — one root value per file. Newline-Delimited JSON (NDJSON) solves the streaming problem by placing one independent JSON value per line, separated by newline characters. Each line is a complete, parseable JSON document. This makes NDJSON suitable for log files, data pipelines, and streaming APIs where the total document size is unknown or unbounded. NDJSON is already covered as a separate format in FileDex. The key distinction: JSON is a document; NDJSON is a stream.
JSON Schema
JSON itself has no mechanism for declaring the expected structure of a document. JSON Schema (current draft: 2020-12) fills this gap by providing a vocabulary for specifying required fields, value types, string patterns, numeric ranges, array lengths, and cross-field dependencies. Validators like Ajv (JavaScript) and jsonschema (Python) check documents against schemas programmatically, enabling API request validation, configuration file checking, and form generation from schema definitions.
API ubiquity
Virtually every modern web API returns JSON. REST endpoints, GraphQL responses, webhook payloads, OAuth token exchanges, and server-sent events all default to JSON serialization. Configuration files across the JavaScript ecosystem — package.json, tsconfig.json, .eslintrc.json, vercel.json — are JSON. Document databases including MongoDB (via BSON), CouchDB, and Firebase Firestore store data as JSON documents. The format's ubiquity means that understanding JSON is a prerequisite for nearly every area of modern software development.
.JSON compared to alternatives
| Formats | Criteria | Winner |
|---|---|---|
| .JSON vs .XML | Verbosity JSON represents the same data in roughly 30-50% fewer bytes than XML. No closing tags, no attribute syntax, no namespace declarations. Objects and arrays map directly to native language data structures. | JSON wins |
| .JSON vs .YAML | Parsing safety JSON has exactly one valid parse for any input. YAML has implicit type coercion edge cases where 'NO' becomes boolean false and '1.0' becomes a float. YAML's whitespace sensitivity causes subtle bugs that JSON's explicit delimiters avoid. | JSON wins |
| .JSON vs .CSV | Nested data JSON natively represents nested objects, arrays, and mixed types at arbitrary depth. CSV is strictly two-dimensional with no standard way to encode hierarchical structures or mixed-type columns. | JSON wins |
Technical reference
- MIME Type
application/json- Developer
- Douglas Crockford / Ecma International
- Year Introduced
- 2001
- Open Standard
- Yes — View specification
Binary Structure
JSON is a pure text format with no binary structure, no magic bytes, and no file header. Files are encoded in UTF-8 as mandated by RFC 8259 for interchange; earlier specifications permitted UTF-16 and UTF-32, but these are now prohibited. A JSON document consists of a single root value, typically an object (curly braces) or an array (square brackets), though RFC 8259 allows any value type at the root level. Objects contain comma-separated key-value pairs where keys are always double-quoted strings. Arrays contain comma-separated values. Whitespace between tokens — spaces, tabs, line feeds, and carriage returns — is insignificant and ignored by parsers. A BOM (Byte Order Mark) must not be added per RFC 8259 Section 8.1. No trailing commas, no single-quoted strings, no comments, and no undefined value are permitted. Number precision and maximum nesting depth are implementation-dependent.
Conversion not yet available. JSON to CSV or XML transformation requires schema mapping — a feature under consideration.
Attack Vectors
- Prototype pollution — crafted JSON payloads containing __proto__ or constructor keys can modify object prototypes when parsed with unsafe merge or deep-copy functions, enabling property injection across the application
- JSON injection — user input concatenated into JSON strings without proper escaping allows injection of additional key-value pairs or manipulation of existing values in server-side JSON construction
- Insecure deserialization — parsers that instantiate objects based on type hints embedded in JSON data (common in Java and .NET frameworks) can trigger arbitrary code execution via gadget chains
- Denial of service via deeply nested structures — JSON documents with 10,000+ nesting levels can cause stack overflow in recursive parsers, and extremely large string values can exhaust parser memory
Mitigation: FileDex processes JSON files entirely in the browser using native JSON.parse() — no eval(), no server upload, no deserialization into application objects. Parsing occurs within the browser's memory sandbox with built-in stack depth limits.
- Specification RFC 8259 — The JavaScript Object Notation (JSON) Data Interchange Format (Internet Standard, STD 90)
- Specification ECMA-404 (2nd Edition) — The JSON Data Interchange Syntax
- Registry JSON (JavaScript Object Notation) — Library of Congress Format Description (fdd000381)
- Registry JSON Data Interchange Format (fmt/817) — The National Archives PRONOM Registry
- Registry application/json — IANA Media Types
- History JSON — Wikipedia