YAML ↔ JSON.

Round-trip between YAML 1.2 and JSON. Multi-document YAML (the --- separator) becomes a JSON array; comments are stripped on the way in (JSON has none). Powered by js-yaml, running entirely in your browser.

Sample output (YAML → JSON, 547 bytes in, 846 bytes out):
{
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {
    "name": "payments-api",
    "labels": {
      "app": "payments",
      "tier": "backend"
    }
  },
  "spec": {
    "replicas": 3,
    "selector": {
      "matchLabels": {
        "app": "payments"
      }
    },
    "template": {
      "spec": {
        "containers": [
          {
            "name": "payments",
            "image": "ghcr.io/example/payments:1.42.0",
            "ports": [
              {
                "containerPort": 8080
              }
            ],
            "resources": {
              "requests": {
                "cpu": "200m",
                "memory": "256Mi"
              },
              "limits": {
                "cpu": "1",
                "memory": "1Gi"
              }
            }
          }
        ]
      }
    }
  }
}

Yet Another Markup Language, renamed.

YAML was first published by Clark Evans in 2001, with Ingy döt Net and Oren Ben-Kiki joining as co-designers shortly afterwards. The acronym was originally expanded as "Yet Another Markup Language", a tongue-in-cheek nod to the XML-saturated world of the late 1990s, but was retconned almost immediately into the recursive "YAML Ain't Markup Language" once the authors realised the format was not, in any structural sense, a markup language at all. It is a data serialisation format, closer in spirit to s-expressions or Python's pretty-printed dictionaries than to SGML. The explicit design goal was a serialisation that humans could read and edit without tooling, that machines could round-trip losslessly, and that mapped cleanly onto the native data types of dynamic languages.

The version history is worth knowing because almost every footgun in YAML is a 1.1-versus-1.2 issue. YAML 1.0 landed in 2004, 1.1 in 2005, and 1.2 in 2009. The 1.2 revision was the major one: it explicitly aligned YAML with JSON, dropped a long list of implicit type coercions, and tightened the boolean grammar. A 1.2.2 errata release was published in October 2021 to clean up two decades of accumulated specification bugs without changing semantics. Despite 1.2 being seventeen years old at the time of writing, a sobering number of production parsers still default to 1.1 behaviour, which is the proximate cause of most of the surprises later in this article.

YAML's adoption curve tracked the rise of declarative infrastructure. Kubernetes manifests, Ansible playbooks, GitHub Actions workflows, GitLab CI pipelines, Docker Compose files, CircleCI configs, AWS CloudFormation, Helm charts, and Prometheus rules all chose YAML as their surface syntax. The reasoning was consistent across these projects: YAML allows comments (JSON does not), supports multi-line strings without escaping hell, and reads naturally for the kinds of nested key-value structures that infrastructure-as-code produces. The reference implementations underpinning this ecosystem are a small set: libyaml in C, PyYAML and ruamel.yaml in Python, SnakeYAML on the JVM, go-yaml v2 and v3 in Go, and js-yaml in JavaScript — the parser this page's converter uses.

Every JSON document is valid YAML.

The most important and least appreciated fact about YAML 1.2 is that it is a strict syntactic superset of JSON. Every well-formed JSON document is, byte-for-byte, a well-formed YAML 1.2 document with identical semantics. This was not true of YAML 1.1, where edge cases around numeric tokens and the boolean set diverged from JSON, and it is the single change that makes tools like the YAML-to-JSON converter on this page tractable to build. You can paste raw JSON into a YAML parser and get the expected tree back; you can serialise a YAML document into JSON and, provided you stayed inside the JSON-representable subset, get a faithful round-trip.

YAML achieves this by supporting two parallel syntaxes. Block style is the indentation-driven form most people picture when they think of YAML — keys on their own lines, sequences as dash-prefixed entries. Flow style uses braces and brackets and looks essentially identical to JSON. Most production YAML documents are a hybrid that leans heavily on block style for readability and drops into flow style only for short inline lists.
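
As a rough illustration of the two syntaxes, the stream below writes the same small tree three ways. The field names are borrowed from the sample output at the top of this page; the three-way split into documents is only a presentation device.

```yaml
# Block style: structure comes from indentation.
metadata:
  name: payments-api
  labels:
    app: payments
    tier: backend
---
# Flow style: the same tree, written exactly as JSON would write it.
{"metadata": {"name": "payments-api", "labels": {"app": "payments", "tier": "backend"}}}
---
# Hybrid: block style overall, flow style for one short inline collection.
metadata:
  name: payments-api
  labels: {app: payments, tier: backend}
```

All three documents should load to the same data structure in a conforming YAML 1.2 parser.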

Feature            | JSON                       | YAML 1.1         | YAML 1.2
Comments           | no                         | yes (#)          | yes (#)
Trailing commas    | no                         | allowed in flow  | allowed in flow
Boolean tokens     | true / false               | y, Y, yes, on, … | true / false only
Octal literal 010  | invalid (no leading zeros) | octal 8          | decimal 10
Multi-document     | no                         | yes (---)        | yes (---)
Anchors / aliases  | no                         | yes              | yes

In practice, the YAML you find in a Kubernetes repository or an Ansible role is a small, well-behaved subset: block mappings, block sequences, flow scalars, the occasional block scalar (| for literal, > for folded), and very little else. Anchors, tags, and directives almost never appear. This is not an accident; it is the lived experience of teams who have been bitten by the parts of the spec that the rest of this article is about.
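
The two block scalar indicators mentioned above differ only in how they treat line breaks. A minimal sketch, with hypothetical key names and values:

```yaml
# Literal (|): line breaks are kept as written.
script: |
  echo "step one"
  echo "step two"
# value: "echo \"step one\"\necho \"step two\"\n"

# Folded (>): single line breaks inside the text become spaces.
description: >
  Handles card payments
  for the checkout flow.
# value: "Handles card payments for the checkout flow.\n"
```

Both forms keep one trailing newline by default; the chomping indicators (|- and >-) strip it.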

The Norway problem and friends.

The Norway problem is the canonical example. In YAML 1.1, the unquoted token no parses as the boolean false, alongside n, N, No, NO, false, False, FALSE, off, Off, and OFF. A list of ISO 3166 country codes containing - NO for Norway therefore deserialises to [..., false, ...] in any 1.1 parser. The same trap catches any field that legitimately contains a string such as "yes", "on", or "off". YAML 1.2 fixed this by restricting the boolean set to true and false, but tooling lag means production parsers still trip over it.
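
A small fragment makes the trap concrete. The key names are hypothetical, and the results in the comments are what the 1.1 and 1.2 boolean rules described above imply.

```yaml
# Unquoted: a YAML 1.1 parser resolves NO to a boolean.
countries:
  - GB
  - NO
  - SE
# YAML 1.1 result: ["GB", false, "SE"]
# YAML 1.2 result: ["GB", "NO", "SE"]

# Quoted: every parser sees three strings.
countries_quoted:
  - "GB"
  - "NO"
  - "SE"
```

Quoting is the portable fix, since it does not depend on which version of the spec the parser follows.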

Octal literals are the second classic. Under YAML 1.1, an unquoted token matching 0[0-7]+ is parsed as octal. The Massachusetts ZIP code 02134 becomes the integer 1116. Phone-number prefixes, employee IDs with leading zeros, and version strings have all been silently mangled by this rule. YAML 1.1 also recognised sexagesimal numbers (colon-separated digit groups), so the time string 12:34:56 was liable to be parsed as the base-60 integer 45296. A MAC address such as 01:23:45:67:89:ab escapes this particular rule only because it contains hex letters and groups above 59; an all-digit value of the same shape may not. YAML 1.2 dropped both behaviours.
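
The numeric coercions can be sketched the same way. Key names are hypothetical; the commented results follow from the 1.1 rules above and the 1.2 core schema, and individual parsers may deviate.

```yaml
zip: 02134            # YAML 1.1: integer 1116 (octal); YAML 1.2: integer 2134
zip_quoted: "02134"   # string "02134" in both versions
elapsed: 12:34:56     # YAML 1.1: integer 45296 (12*3600 + 34*60 + 56); YAML 1.2: string
elapsed_quoted: "12:34:56"   # string in both versions
```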

Indentation must be spaces. Tabs are forbidden as indentation characters anywhere in the document, and the resulting parse error is rarely friendly. The %YAML and %TAG directives are syntactically valid but almost never seen in the wild, and some parsers silently ignore them.

Tags (the double-bang prefix, as in !!str) are where security stops being theoretical. PyYAML's yaml.load, when called without a safe loader in versions before 5.1, will instantiate arbitrary Python objects via tags such as !!python/object/apply:os.system. CVE-2017-18342 covered exactly this in PyYAML, and the same class of vulnerability has surfaced repeatedly across language bindings: SnakeYAML had CVE-2022-1471 for unsafe constructor invocation, and Ruby's Psych had its own remote-code-execution disclosures.

Always use safe loaders

If you are processing YAML from any source that is not fully trusted — webhook payloads, customer uploads, third-party Helm charts, repository forks — you must use a safe loader. The default load in PyYAML versions before 5.1 is a remote code execution primitive. Audit your dependencies; this is not a hypothetical.
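
For a sense of how little it takes, the fragment below is a widely published proof-of-concept shape using a PyYAML-specific tag. The key name and the harmless echo command are illustrative choices, not taken from any real incident.

```yaml
# Loaded with an unsafe PyYAML loader, this calls os.system("echo pwned").
# Do not feed untrusted input to such a loader.
greeting: !!python/object/apply:os.system ["echo pwned"]
```

A safe loader such as PyYAML's yaml.safe_load refuses the tag and raises a constructor error instead of running anything, which is the behaviour you want for untrusted input.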

DRY config and its trade-offs.

YAML offers a native mechanism for de-duplication: anchors and aliases. An anchor (the ampersand prefix, &name) marks a node, and an alias (the asterisk prefix, *name) references it. Combined with the merge key (<<), you can splice the contents of one mapping into another, which is useful for, say, sharing a common environment block across three GitLab CI jobs. The feature dates to YAML 1.1 and survives in 1.2, although the merge key itself is technically a separate specification (the merge type) and is not implemented uniformly across parsers: go-yaml v3 dropped support for it in some configurations, and there is ongoing debate about whether it belongs in YAML 1.3.
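
The GitLab CI case might look roughly like the sketch below. The job names, variable names, and values are all hypothetical, and the jobs are abbreviated (no script sections), so treat it as an illustration of the syntax rather than a working pipeline.

```yaml
.shared_env: &shared_env      # anchor: names this mapping
  NODE_ENV: production
  LOG_LEVEL: info

build:
  variables:
    <<: *shared_env           # merge key plus alias: splice the mapping in
    STAGE_NAME: build

test:
  variables:
    <<: *shared_env
    STAGE_NAME: test
    LOG_LEVEL: debug          # an explicit key overrides the merged value

deploy:
  variables:
    <<: *shared_env
    STAGE_NAME: deploy
```

In a parser that implements the merge type, each job ends up with its own expanded copy of the shared mapping; converting to JSON writes those copies out in full, because JSON has no alias mechanism.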

Kubernetes is a revealing case study. Anchors and aliases are syntactically valid in any manifest, but the Kubernetes community has effectively standardised on not using them. The reason is operational: kubectl diff, GitOps reconciliation, and audit tooling all benefit from manifests that are textually self-contained. An anchor that lives 400 lines above its alias makes reviewers' lives harder, and the resulting expanded form on the cluster no longer matches the source on disk.

This is the headwater of the YAML-templating swamp. Once teams want real abstraction (a value computed from another value, a loop generating ten near-identical resources, a conditional include), they reach for Helm (Go templates over YAML), Ansible's Jinja2-in-YAML, or Kustomize's overlay model. Text templating in the Helm style lets you write files that are not valid YAML until after rendering, which means editor support, schema validation, and syntax highlighting all break at the source level. Kustomize occupies a middle ground, manipulating valid YAML through structured patches rather than text templating. The wider reaction has been a generation of typed configuration languages: jsonnet (Google), CUE (Marcel van Lohuizen and others), Dhall (a total functional language), and Pkl (Apple, open-sourced in 2024).

The --- separator.

A single YAML file can contain multiple documents separated by --- on its own line, optionally terminated by a line of three dots (...). This is a feature JSON does not have and cannot easily emulate, and it is the reason kubectl apply -f bundle.yaml works against a file containing a Deployment, a Service, and a ConfigMap. GitOps tools such as Argo CD and Flux rely on it heavily, and both Helm-rendered output and kustomize build output are multi-document streams.

Multi-document YAML enables stream parsing: a parser can yield one document at a time without holding the entire file in memory, which matters when a Helm chart for a large platform expands to tens of thousands of lines. Any conversion to JSON, however, requires a representational choice. JSON has no document-stream concept, so a multi-doc YAML file must be either emitted as a JSON array of objects or split into N separate JSON files. The converter on this page wraps multi-document input in a top-level array, which is the convention NDJSON-adjacent tools have settled on, but it does mean the round-trip from JSON back to YAML is not unique — a JSON array could equally well represent a single YAML sequence.
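
A bundle of the kind described above might be laid out as follows. The manifests are heavily abbreviated and the resource names are placeholders, so this shows the stream structure only.

```yaml
# bundle.yaml: three documents in one stream
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api
---
apiVersion: v1
kind: Service
metadata:
  name: payments-api
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: payments-config
...
```

Under the array convention described above, this input would become one JSON array holding three objects, in document order.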

List kind vs document stream

When you pipe kubectl get all -o yaml into a JSON-consuming tool, remember that the output is a single document of kind List whose items field holds the resources, not a stream of documents divided by --- separators. The two look similar at a glance and behave differently under jq.
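
For comparison, a List-kind document has roughly this shape. The items are abbreviated and their names are placeholders.

```yaml
# One document: the resources are entries in a sequence, not separate documents.
apiVersion: v1
kind: List
items:
  - apiVersion: v1
    kind: Service
    metadata:
      name: payments-api
  - apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: payments-api
```

Converted to JSON this is a single object, so a jq filter would reach the resources through .items[] rather than by iterating over a top-level array.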

Configuration languages that aren't YAML.

YAML is a human-authoring format that pays for its readability in parser complexity and ambiguity. If a file is only ever produced and consumed by machines (internal RPC payloads, cache entries, log records, message-queue bodies), JSON is the better choice. It is compact once minified, typically parses considerably faster, has no surprising coercions, and nearly every mainstream language ships a parser in its standard library. Reach for YAML only when humans will read or edit the file by hand.

For configuration specifically, several focused alternatives have eclipsed YAML in their respective niches. TOML, designed by Tom Preston-Werner, powers Rust's Cargo, Python's pyproject.toml, and a growing share of CLI tool configs; it is deliberately less expressive than YAML and correspondingly less surprising. HCL — HashiCorp Configuration Language — drives Terraform and is built around the resource-graph mental model rather than tree serialisation, which makes references and interpolations first-class. For configuration that needs computation, types, or imports, the modern choices are jsonnet, CUE, Dhall, and Pkl. Pkl in particular, released by Apple in February 2024, treats configuration as a typed program with schema validation, late binding, and output adapters for YAML, JSON, plist, and properties files.

The pattern across these tools is the same: configuration is code, and code deserves a real language with a type system, a module system, and a test story. YAML earned its place as the lingua franca of declarative infrastructure, but the next decade of platform engineering is being written in languages that emit YAML, not in YAML itself. Knowing when to convert, when to template, and when to escape the format entirely is increasingly part of the staff-engineer remit.

Why ops teams keep getting bitten.

Indentation matters and tabs are illegal, yet an editor configured to indent with tabs will happily insert them where YAML rejects them. Quoted and unquoted scalars parse differently: version: 1.10 becomes the float 1.1, while version: "1.10" stays a string. Kubernetes resource quantities like cpu: 1 parse as a number rather than a string; the API accepts both forms, but quoting keeps the type consistent with values such as "200m" that can only be strings. Anchors and aliases (&ref and *ref) make config DRYer but introduce ordering dependencies that aren't visible in the final document. Many shops have moved to jsonnet, CUE, or Dhall specifically to escape these footguns.
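
The quoting cases above, written out as a fragment. The key names are hypothetical, and the comments give the types a typical parser would be expected to produce.

```yaml
version: 1.10            # float 1.1: the trailing zero is lost
version_quoted: "1.10"   # string "1.10"
resources:
  limits:
    cpu: 1               # integer 1
    memory: 1Gi          # string "1Gi": not numeric, so it stays a string
  requests:
    cpu: "200m"          # string, quoted explicitly
    memory: "256Mi"      # string, quoted explicitly
```

Quoting every quantity, as in the requests block, means the converted JSON carries strings throughout, matching the sample output at the top of this page.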

Number coercion ate my zip codes

A 5-digit US zip code like 02134 parses as the integer 1116 in YAML 1.1 (octal) or 2134 in 1.2 (leading zero stripped). Always quote strings that look like numbers but aren't.
