YAML Data
YAML serializes data as indentation-structured key-value pairs, sequences, and mappings in plain UTF-8 text. Standardized as YAML 1.2 (2009) and registered as application/yaml (RFC 9512, 2024), it is the default configuration format for Kubernetes, Docker Compose, GitHub Actions, and Ansible.
Configuration/data format. Conversion between data formats requires semantic mapping.
Common questions
YAML vs JSON — which should I use?
Use YAML when humans read and write the file directly (configuration, CI/CD pipelines, Kubernetes manifests). Use JSON when machines produce and consume the file (APIs, data storage, language interop). YAML 1.2 is a superset of JSON, so any valid JSON is valid YAML.
Why do tabs cause YAML parsing errors?
The YAML specification forbids tab characters for indentation — only spaces are allowed. Tabs render at different widths in different editors, which would make indentation-based structure ambiguous. Configure your editor to insert spaces when pressing Tab for YAML files.
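A tab check is easy to run before handing a file to a parser. The following is a minimal sketch, not part of any YAML library: the function name `find_tab_indented_lines` is hypothetical, and the check is a heuristic that may also flag tabs that are legitimate content at the start of a block scalar line.

```python
from typing import List


def find_tab_indented_lines(text: str) -> List[int]:
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    offenders = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything before the first non-space, non-tab character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            offenders.append(number)
    return offenders


sample = 'a:\n\tb: 1\n  c: 2\nd: "x\\ty"\n'
bad_lines = find_tab_indented_lines(sample)
```

Tabs that appear after the first content character, such as inside a quoted string, are ignored because only the indentation is inspected.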
What is the Norwegian boolean problem in YAML?
YAML 1.1 parsers treat unquoted 'yes', 'no', 'on', 'off' as booleans. The ISO country code for Norway (NO) silently becomes false. YAML 1.2 fixed this, but PyYAML and many tools still default to 1.1 rules. Always quote ambiguous values: country: 'NO'.
What is a YAML billion laughs attack?
The billion laughs attack exploits YAML anchor/alias expansion. A small file defines a chain of anchors where each references the previous multiple times, causing exponential memory growth. A 1 KB file can consume gigabytes of RAM. Mitigate by setting alias expansion limits in your parser or rejecting untrusted YAML input.
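The growth is simple arithmetic: if each level references the previous one `fanout` times, full expansion yields `fanout ** depth` copies of the innermost value. The sketch below only builds the text and computes that count; it never parses or expands anything. The helper names and the `a0`, `a1`, ... anchor names are illustrative assumptions.

```python
def build_alias_chain(depth: int, fanout: int) -> str:
    """Build YAML text in which each level aliases the previous level `fanout` times."""
    lines = ['a0: &a0 "lol"']
    for level in range(1, depth + 1):
        refs = ", ".join(["*a%d" % (level - 1)] * fanout)
        lines.append("a%d: &a%d [%s]" % (level, level, refs))
    return "\n".join(lines) + "\n"


def expanded_leaf_count(depth: int, fanout: int) -> int:
    """Number of leaf scalars produced if every alias is fully expanded."""
    return fanout ** depth


text = build_alias_chain(depth=9, fanout=9)
size_in_bytes = len(text.encode("utf-8"))
leaves = expanded_leaf_count(depth=9, fanout=9)
```

With nine levels and nine references per level the text stays well under 1 KB, while full expansion would produce hundreds of millions of leaf values. Some parsers share one object per alias rather than copying it, so the blow-up may only appear when a consumer walks or re-serializes the whole structure.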
What makes .YAML special
What is a YAML file?
YAML (YAML Ain't Markup Language) is a human-friendly data serialization format used extensively for configuration files. YAML uses indentation to represent structure, making it more readable than JSON or XML. It supports scalars, lists, maps, and complex nested structures with anchors and aliases.
The format was designed to solve a specific problem: XML was the dominant configuration format in the early 2000s, but it was verbose and hard to read. YAML's whitespace-based structure mirrors how humans naturally write structured text, which is why it became the configuration language of choice for DevOps tooling, CI/CD pipelines, and cloud-native infrastructure.
How to open YAML files
- VS Code (Windows, macOS, Linux) — Built-in YAML support; Red Hat YAML extension adds schema validation
- Any text editor — YAML files are plain UTF-8 text; indentation is critical so use an editor that shows whitespace
- YAML Lint (Web) — Online validation at yamllint.com
- PyYAML (Python) — Programmatic parsing: yaml.safe_load(open('file.yaml'))
- yq (CLI) — Query and transform YAML from the command line
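For the PyYAML route, a slightly more careful version of the one-liner closes the file handle and fixes the encoding. This is a sketch that assumes the third-party PyYAML package is installed; the wrapper name `load_yaml_file` is hypothetical, and the guarded import is only there so the snippet can be imported on machines without PyYAML.

```python
try:
    import yaml  # PyYAML, a third-party package (pip install pyyaml)
except ImportError:  # keep the sketch importable where PyYAML is absent
    yaml = None


def load_yaml_file(path):
    """Parse a single YAML document from `path` with PyYAML's safe loader."""
    if yaml is None:
        raise RuntimeError("PyYAML is not installed")
    # The context manager closes the file even if parsing raises an error.
    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle)
```

`yaml.safe_load` builds only plain Python types (dicts, lists, strings, numbers, booleans, None), which is why it is preferred over `yaml.load` for files of unknown origin.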
Technical specifications
| Property | Value |
|---|---|
| Version | YAML 1.2 (2009), with 1.2.2 revision (2021) |
| Encoding | UTF-8, UTF-16, UTF-32 (UTF-8 strongly recommended) |
| Structure | Indentation-based (spaces only — tabs forbidden) |
| Data types | Strings, integers, floats, booleans, null (1.2 core schema); timestamps and binary via the YAML 1.1 type repository |
| JSON superset | YAML 1.2+ is a strict superset of JSON |
| Comments | Supported (# comment); JSON, by contrast, has no comment syntax |
| Multi-document | Multiple YAML documents in one file, separated by --- |
Common use cases
- DevOps: Docker Compose, Kubernetes manifests, Ansible playbooks
- CI/CD: GitHub Actions, GitLab CI, CircleCI, Drone CI workflow definitions
- Application config: Spring Boot (application.yaml), Rails, Jekyll, Hugo
- API specs: OpenAPI/Swagger definitions (both YAML and JSON variants)
- Infrastructure as Code: Helm charts (Kubernetes package manager), AWS CloudFormation
YAML syntax quick reference
# Scalars
name: FileDex
version: 2
active: true
ratio: 3.14
nothing: null
# Multi-line string (literal block — preserves newlines)
description: |
  First line.
  Second line.
  Third line.
# Sequences (lists)
features:
  - search
  - convert
  - multilingual
# Mappings (nested objects)
database:
  host: localhost
  port: 5432
  name: filedex_db
# Anchors and aliases (DRY config)
defaults: &defaults
  timeout: 30
  retries: 3
production:
  <<: *defaults # merge key — inherit defaults
  timeout: 60 # override one value
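To make the merge-key behavior in the anchors example concrete, here is a rough Python equivalent of what a parser that supports `<<` would produce for `production`. It is a sketch of the semantics only; `merge_mapping` is a hypothetical helper, not a parser API.

```python
def merge_mapping(base: dict, overrides: dict) -> dict:
    """Copy `base`, then let keys written explicitly in the mapping win."""
    merged = dict(base)       # values inherited through the alias
    merged.update(overrides)  # explicit keys override merged ones
    return merged


defaults = {"timeout": 30, "retries": 3}
production = merge_mapping(defaults, {"timeout": 60})
```

Here `production` keeps `retries` from the defaults and replaces `timeout`, while `defaults` itself is left unchanged.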
The Norwegian boolean problem
One of YAML's most notorious quirks: YAML 1.1, which many parsers still implement, resolves the unquoted scalars yes, no, on, off, true, and false to booleans in their lowercase, Capitalized, and UPPERCASE spellings (the 1.1 type repository also lists y and n). As a result, country: NO parses as false, not the string "NO" (Norway's ISO 3166-1 code). The YAML 1.2 core schema recognizes only true and false as booleans, but many parsers still apply the 1.1 rules.
Solution: Always quote ambiguous values: country: "NO" or country: 'off'.
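When YAML is generated by hand-written string templates, a small guard can catch these values before they are written. The sketch below assumes the boolean spellings listed in the YAML 1.1 type repository; the names `needs_quoting` and `quote_if_ambiguous` are hypothetical, and the check covers booleans only, not null or numeric look-alikes.

```python
import re

# Plain scalars that the YAML 1.1 type repository resolves to booleans.
_YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)


def needs_quoting(value: str) -> bool:
    """True if a YAML 1.1 parser would read the unquoted value as a boolean."""
    return _YAML11_BOOL.fullmatch(value) is not None


def quote_if_ambiguous(value: str) -> str:
    """Wrap boolean look-alikes in double quotes; leave other values untouched."""
    return '"' + value + '"' if needs_quoting(value) else value


line = "country: " + quote_if_ambiguous("NO")
```

A YAML library's own dumper normally handles this quoting already, so a guard like this is mainly useful for templated output.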
.YAML compared to alternatives
| Formats | Criteria | Winner |
|---|---|---|
| .YAML vs .JSON | Human readability: YAML uses indentation instead of braces, supports comments, and does not require quotes around most strings. JSON requires double-quoted keys, braces, brackets, and has no comment syntax. | YAML wins |
| .YAML vs .JSON | Parsing safety: JSON has exactly one valid parse for any input. YAML has implicit type coercion (country code 'NO' becomes boolean false), whitespace sensitivity, and anchor/alias expansion that can cause memory exhaustion. | JSON wins |
| .YAML vs .TOML | Complex nesting: YAML handles deeply nested structures with inline flow style or indentation. TOML requires verbose dotted keys or repeated [section.subsection] headers for deep nesting. | YAML wins |
| .YAML vs .TOML | Type safety: TOML has explicit, unambiguous types with no implicit coercion. YAML silently converts unquoted 'yes', 'no', 'null', and numeric-looking strings to booleans, null, or numbers. | TOML wins |
Technical reference
- MIME Type: application/yaml (registered by RFC 9512; application/x-yaml is a deprecated alias)
- Developer: Clark Evans / Oren Ben-Kiki / Ingy döt Net
- Year Introduced: 2001
- Open Standard: Yes
Binary Structure
YAML is a text format with no binary structure. A YAML stream may contain multiple documents separated by document-start markers (---) and optionally terminated by document-end markers (...). Each document consists of a root node that is either a mapping (key-value pairs), a sequence (ordered list), or a scalar (string, integer, float, boolean, null, timestamp). Mappings use colon-space (: ) as the key-value separator. Sequences use dash-space (- ) as the item prefix. Indentation with spaces (tabs are forbidden) defines nesting depth. Scalars can be plain (unquoted), single-quoted, double-quoted, or block scalars using literal (|) or folded (>) indicators. Anchors (&name) define reusable nodes; aliases (*name) reference them. Merge keys (<<: *anchor) combine mappings. Tags (!!str, !!int, !!python/object) specify or override type resolution. Comments begin with # and extend to end of line. YAML 1.2 is a strict superset of JSON — any valid JSON document is valid YAML 1.2. Character encoding must be UTF-8 (recommended), UTF-16, or UTF-32. No BOM is required; if present, it indicates encoding.
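The multi-document layout can be illustrated with a deliberately simplified splitter. This sketch treats only bare `---` and `...` lines as markers; it ignores directives such as `%YAML`, drops empty documents, and does not handle content placed on the same line as `---`. The function name `split_documents` is hypothetical. For real input, a parser should do this work (PyYAML, for instance, provides `yaml.safe_load_all`).

```python
from typing import List


def split_documents(stream: str) -> List[str]:
    """Split a YAML stream into raw document texts at bare '---' / '...' lines."""
    documents = []
    current = []
    for line in stream.splitlines():
        if line.rstrip() in ("---", "..."):
            # A marker closes the document collected so far, if any.
            if current:
                documents.append("\n".join(current))
                current = []
            continue
        current.append(line)
    if current:
        documents.append("\n".join(current))
    return documents


stream = "---\nname: a\n---\nname: b\n...\n"
docs = split_documents(stream)
```

Each returned string is still unparsed YAML text, so it would be passed to a parser separately.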
Attack Vectors
- Arbitrary code execution via yaml.load() (Python) — YAML tags like !!python/object/apply:os.system allow arbitrary command execution during parsing (CVE-2017-18342)
- Billion laughs / YAML bomb — anchor/alias chains cause exponential memory expansion; a 1 KB file can consume gigabytes of RAM, causing denial of service
- Implicit type coercion — unquoted values silently converted to booleans, null, or numbers, causing logic errors and potential security bypasses in configuration-driven access control
- Ruby YAML.load vulnerability — Ruby's Psych YAML loader can instantiate arbitrary Ruby objects from YAML tags, enabling remote code execution in Rails applications
Mitigation: FileDex processes YAML files entirely in the browser using js-yaml's safeLoad mode. No server-side parsing, no code execution from YAML tags, no external resource loading.