
JSONPath.

Query JSON with the syntax Stefan Goessner sketched in 2007. Supports root ($), child access (.key), recursive descent (..), wildcards ([*]), slices ([1:3]), and filter expressions ([?(@.price < 15)]). RFC 9535 standardised the dialect in February 2024 — this implementation follows the popular subset.

A long road to a real spec.

JSONPath began as a sketch. In February 2007, Stefan Goessner published a short blog post titled "JSONPath — XPath for JSON" that adapted the XPath idiom to JSON's tree shape and provided a roughly fifty-line JavaScript reference implementation alongside a parallel PHP version. The post was deliberately informal: it gave examples, a small grammar table, and a recursion-based evaluator, but it left a surprising amount unspecified. What does $..* mean when the document contains arrays inside arrays? Does a filter expression that throws on a missing key return false, or does it propagate the error? Is the slice [0:10:2] part of the language? Goessner's answer, more or less, was "do something reasonable."

That informality is why every JSONPath implementation you have ever used disagrees with at least one other. Christoph Burgmer's "JSONPath comparison" project, started in 2019 and maintained as part of the IETF working group's reference material, catalogues more than forty libraries across Python, Java, JavaScript, Go, Rust, and C# and tests them against a shared corpus of queries. The disagreements are not edge cases. Implementations differ on whether $.store.book[*].author returns an array or unwraps a single match, on whether negative slice indices count from the end or wrap around, on the precedence of && versus || inside filters, on whether the recursive-descent operator visits object keys, array indices, both, or only "values," and on whether @.price < 10 against a missing price is false, error, or silently true.

RFC 9535, published in February 2024 and edited by Stefan Gössner (Goessner himself), Glyn Normington, and Carsten Bormann, finally pins down a normative subset. It defines the abstract syntax, the evaluation semantics in terms of nodelists, the notion of a "normalized path" (a canonical form using only $, bracketed name selectors, and bracketed integer indices), and an extension mechanism for type-checked function calls. It is the document any new implementation should target. Even so, gaps remain. The RFC is silent on streaming evaluation, leaves the catalog of function extensions deliberately small (length, count, match, search, value), and notes but does not resolve compatibility-mode questions for older implementations.

A useful rule for JSONPath tooling

If you maintain a tool that accepts JSONPath from users, the single most useful thing you can do is print, alongside any result, the normalized path of every match. RFC 9535 §2.7 defines exactly what this should look like, and it eliminates an enormous class of "why did your query return that" support tickets.
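A minimal sketch of what printing a normalized path could look like, assuming your evaluator already tracks each match as a sequence of object keys and array indices. The function name normalized_path and the segment-list representation are illustrative choices, not part of any particular library.

```python
# Characters that a normalized path writes as backslash escapes inside
# single-quoted names.
_ESCAPES = {
    "\b": "\\b", "\f": "\\f", "\n": "\\n", "\r": "\\r", "\t": "\\t",
    "'": "\\'", "\\": "\\\\",
}


def normalized_path(segments):
    """Render keys (str) and indices (int) in the RFC 9535 normalized form."""
    parts = ["$"]
    for seg in segments:
        if isinstance(seg, bool):
            raise TypeError("segments must be str keys or int indices")
        if isinstance(seg, int):
            if seg < 0:
                raise ValueError("normalized paths use non-negative indices")
            parts.append(f"[{seg}]")
            continue
        escaped = []
        for ch in seg:
            if ch in _ESCAPES:
                escaped.append(_ESCAPES[ch])
            elif ord(ch) < 0x20:
                # Remaining control characters become \u00xx in lowercase hex.
                escaped.append(f"\\u{ord(ch):04x}")
            else:
                escaped.append(ch)
        parts.append("['" + "".join(escaped) + "']")
    return "".join(parts)


print(normalized_path(["store", "book", 0, "title"]))
```

Printed next to each match, a string like this tells the user exactly which node was selected, whatever shorthand the original query used.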

Every operator, in detail.

The root selector $ anchors every JSONPath expression. Inside a filter, @ rebinds to the current candidate node, so $.items[?(@.price < 10)] reads as "starting at the root, descend into items, and for each element bind it to @ and keep it if its price is less than ten." The two never appear interchangeably outside a filter; a bare @ at the top level is a syntax error in RFC 9535, even though Goessner's original implementation tolerated it.

Member access has two equivalent forms. Dot notation, $.store.book, is concise and reads naturally, but most implementations accept it only for keys matching [A-Za-z_][A-Za-z0-9_]* (RFC 9535 additionally allows non-ASCII characters in the shorthand). Bracket notation, $['store']['book'], accepts any string, including keys with spaces, hyphens, dots, or Unicode. Mixing them is fine: $.config['kafka.bootstrap.servers'][0]. The recursive descent operator .. is the one piece of syntax most likely to surprise. $..author does not mean "any author key directly under root"; it means "every author key at any depth, in document order." Evaluating it visits every node in the document, which has performance consequences on large inputs.
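To make "every node" concrete, here is a small sketch of recursive descent as a pre-order walk. The helper names descendants and find_key are hypothetical; find_key only approximates what $..name does, and real implementations add far more machinery.

```python
def descendants(node, path=()):
    """Yield (path, value) for node and every descendant, parents first."""
    yield path, node
    if isinstance(node, dict):
        for key, child in node.items():
            yield from descendants(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from descendants(child, path + (index,))


def find_key(doc, name):
    """Rough equivalent of $..name: members called `name` at any depth."""
    return [
        value[name]
        for _, value in descendants(doc)
        if isinstance(value, dict) and name in value
    ]


doc = {
    "store": {
        "book": [{"author": "A", "title": "t1"}, {"author": "B"}],
        "owner": {"author": "C"},
    },
    "author": "root",
}
print(find_key(doc, "author"))
print(sum(1 for _ in descendants(doc)), "nodes visited")
```

Note that the walk touches scalars as well as containers; the cost is proportional to the size of the document, not to the number of matches.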

Wildcards come in two shapes. [*] enumerates array elements or object values; .* does the same in dot form. Slices borrow Python semantics: [a:b] is half-open, [a:b:c] adds a step, and negative indices count from the end. [-3:] returns the last three elements; [::-1] reverses the array. Filter expressions live inside [?(...)] and support equality, ordering, regex match, boolean combinators, and parenthesisation. A query for a missing member yields a special "Nothing" value that is unequal to every defined value, including JSON null. That subtlety distinguishes "key absent" from "key present with value null."
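Because the slice rules are borrowed from Python, a sketch of a slice selector can lean on Python's own slice object. The function name jsonpath_slice is illustrative. The one deliberate difference handled here is a step of zero, which RFC 9535 defines as selecting nothing, whereas Python raises an error.

```python
def jsonpath_slice(items, start=None, end=None, step=None):
    """Apply a JSONPath-style [start:end:step] slice to a list."""
    if not isinstance(items, list):
        return []  # slices select nothing from objects and scalars
    if step == 0:
        return []  # RFC 9535: step 0 selects no elements
    return items[slice(start, end, step)]


books = ["a", "b", "c", "d", "e"]
print(jsonpath_slice(books, 1, 3))          # [1:3]
print(jsonpath_slice(books, -3))            # [-3:]
print(jsonpath_slice(books, step=-1))       # [::-1]
```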

Function extensions in RFC 9535 are typed. length(@.items) returns an integer or Nothing; count($..book) returns the cardinality of a nodelist; match(@.sku, '[A-Z]{3}-[0-9]+') returns a logical result for a full-string regex match (so no anchors are needed), search() for a substring match; value() coerces a singleton nodelist to its scalar value so it can participate in comparisons. The type system rejects an ill-typed expression such as match(@.sku, '[A-Z]+') == true at parse time, because a logical result cannot be a comparison operand, rather than producing a confusing runtime result.
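The behaviour of the five functions can be sketched in a few lines. This is an approximation under stated assumptions: the fn_ names and the NOTHING sentinel are invented for illustration, nodelists are modelled as plain Python lists, and Python's re module stands in for the I-Regexp dialect that the RFC actually prescribes.

```python
import re

NOTHING = object()  # stand-in for the RFC's "Nothing" (no value at all)


def fn_length(value):
    """Length of a string, array, or object; Nothing for anything else."""
    if isinstance(value, (str, list, dict)):
        return len(value)
    return NOTHING


def fn_count(nodelist):
    """Number of nodes in a nodelist."""
    return len(nodelist)


def fn_match(value, pattern):
    """True only if the whole string matches the pattern."""
    return isinstance(value, str) and re.fullmatch(pattern, value) is not None


def fn_search(value, pattern):
    """True if any substring matches the pattern."""
    return isinstance(value, str) and re.search(pattern, value) is not None


def fn_value(nodelist):
    """The single node's value, or Nothing if there is not exactly one."""
    return nodelist[0] if len(nodelist) == 1 else NOTHING


print(fn_match("ABC-123", r"[A-Z]{3}-[0-9]+"))
print(fn_match("xABC-123", r"[A-Z]{3}-[0-9]+"))
print(fn_search("xABC-123", r"[A-Z]{3}-[0-9]+"))
```

The difference between fn_match and fn_search is the one most likely to bite: the same pattern gives different answers depending on whether it must cover the whole string.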

You've been writing JSONPath all along.

If you have written a Kubernetes manifest you have written JSONPath, even if you did not call it that. kubectl get pods -o jsonpath='{.items[*].metadata.name}' is the canonical example, and Kubernetes ships its own Go implementation that diverges from RFC 9535 in several documented ways, most notably the use of {} as expression delimiters rather than bare expressions. GitHub Actions evaluates if: conditions through a context-expression language that borrows the dot-and-bracket subset from JSONPath. GitLab CI's rules: clauses and Argo Workflows' when: conditions do the same.

Observability platforms lean on JSONPath for extraction. Datadog log pipelines use it in their grok and JSON parsing processors to pluck fields out of structured logs; Splunk's spath command is JSONPath under another name. API testing tools — Postman's pm.expect(pm.response.json()) chains, Karate's match assertions, REST Assured's body() matchers in Java — are all evaluating JSONPath-shaped expressions against response bodies. JSONata, a related language by Andrew Coleman first released in 2016, looks like JSONPath but is a full transformation language with its own type system and lambda functions; mistaking one for the other will burn an afternoon.

Tool               Variant              Notable divergence
kubectl            Kubernetes JSONPath  brace delimiters, range keyword, no regex matching
GitHub Actions     Context expressions  subset only, no recursive descent
GitLab CI rules    CEL-flavoured        filter syntax differs, no slices
Argo Workflows     expr-based           uses Go's expr package
Datadog pipelines  Goessner-classic     pre-RFC, regex via =~
Splunk spath       Path-only            no filters, no recursion
Postman / Karate   Goessner-classic     library dependent, often Jayway
JSONata            Distinct language    superset, transformations, lambdas

Three tools, three jobs.

The three are often listed together but they do different jobs. JSONPath is a read-only path query language: every expression selects a nodelist from a single input document and returns those nodes unchanged. It does not transform, it does not aggregate, it does not produce structures the input did not contain. jq, by Stephen Dolan, first released in 2012 and now at version 1.7.1 (2023), is a full programming language. It has pipes, variable bindings, user-defined functions, recursion, generators, mathematical operators, string formatting, regex, SQL-style group-by, reduce, foreach, and a module system. A jq program can take an input and emit something with completely different shape; JSONPath cannot.

JMESPath sits between them. Designed by James Saryerwinnie at AWS around 2014 and used as the query language for the AWS CLI's --query flag, the Boto3 SDK's paginators, and the Azure CLI's --query flag, it is read-only like JSONPath but adds projections, multi-select hashes, pipe expressions, and a small library of built-in functions. reservations[*].instances[?state.name=='running'].{Id: instanceId, Type: instanceType} is idiomatic JMESPath and would require either jq or post-processing in JSONPath.
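To see what "post-processing" means in practice, here is roughly what that JMESPath expression does, written out by hand in plain Python. The function names are hypothetical and the sketch assumes the sample shape shown below. The result is one list per reservation, mirroring how nested projections behave.

```python
def _state_name(instance):
    """Return instance['state']['name'] if both levels exist, else None."""
    state = instance.get("state") if isinstance(instance, dict) else None
    return state.get("name") if isinstance(state, dict) else None


def running_instances(doc):
    """Select running instances and reshape each into {Id, Type}."""
    shaped = []
    for reservation in doc.get("reservations", []):
        instances = reservation.get("instances")
        if not isinstance(instances, list):
            continue  # nothing to project over for this reservation
        shaped.append([
            {"Id": inst.get("instanceId"), "Type": inst.get("instanceType")}
            for inst in instances
            if _state_name(inst) == "running"
        ])
    return shaped


doc = {
    "reservations": [
        {"instances": [
            {"instanceId": "i-1", "instanceType": "t3.micro",
             "state": {"name": "running"}},
            {"instanceId": "i-2", "instanceType": "t3.small",
             "state": {"name": "stopped"}},
        ]},
        {"instances": [
            {"instanceId": "i-3", "instanceType": "m5.large",
             "state": {"name": "running"}},
        ]},
    ]
}
print(running_instances(doc))
```

The selection half (which instances are running) is within reach of JSONPath. The renaming into Id and Type is the part it cannot express.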

Performance differs in kind, not just degree. jq compiles its program to bytecode for a stack VM, which means a complex pipeline pays a one-time compilation cost and then runs at near-C speed. JSONPath implementations are almost universally tree-walking interpreters that re-parse or at best AST-cache each expression, and recursive descent forces a full traversal. JMESPath has a simpler grammar than jq and a smaller surface than JSONPath-with-filters, so its interpreters tend to outrun JSONPath on equivalent queries despite both being interpreted.

A rule of thumb

If you can write the query as a fixed path with at most one filter, JSONPath or JMESPath is appropriate and will be readable to a generalist. If you find yourself wanting to "and then" — first select, then reshape, then group — you have outgrown path languages, and continuing to fight them in the name of consistency will produce code your future self cannot read.

Where implementations disagree.

Recursive descent through arrays is the spec's most contested corner. Given {"a": [{"b": 1}, {"b": 2}]}, does $..b return [1, 2]? Every implementation says yes. Does $..[0] return the first element of every array at every depth? RFC 9535 says yes; several pre-RFC libraries say it is a syntax error because they restricted .. to be followed by a name or wildcard. Filter evaluation order matters when filters have side effects in function extensions — RFC 9535 specifies left-to-right with short-circuit && and ||, but older Jayway-derived libraries evaluated eagerly and could throw on the right operand of a short-circuited &&.
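The eager-versus-short-circuit difference is easy to reproduce. In this sketch the two predicates and the two combinators are hypothetical stand-ins for the operands of a filter's && and for how an evaluator might combine them.

```python
def has_items(order):
    """Left operand: safe on any object."""
    return "items" in order


def first_qty_over_5(order):
    """Right operand: raises KeyError when 'items' is missing."""
    return order["items"][0]["qty"] > 5


def and_short_circuit(left, right, node):
    """Evaluate right only if left is true."""
    return left(node) and right(node)


def and_eager(left, right, node):
    """Evaluate both operands first, then combine."""
    left_result = left(node)
    right_result = right(node)  # runs even when left_result is False
    return left_result and right_result


order = {"id": 1}  # no 'items' member
print(and_short_circuit(has_items, first_qty_over_5, order))
try:
    and_eager(has_items, first_qty_over_5, order)
except KeyError as exc:
    print("eager evaluation raised KeyError:", exc)
```

A guard on the left of && protects the right operand only when the evaluator short-circuits.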

Nested filters compound the @ problem. In $.orders[?(@.items[?(@.qty > 5)])], the inner @ should bind to each item, the outer @ to each order. RFC 9535 makes this lexical and unambiguous; Goessner's original prose was vague, and at least one widely deployed Python library got the binding wrong until 2022. Missing keys versus explicit null is another live grenade: @.deleted_at == null is true for an object with that field set to null but, under RFC 9535, false for an object missing the field entirely because the latter compares Nothing to null. Some libraries treat them as equal, with predictable bug reports.
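The absent-versus-null distinction can be sketched with a sentinel object. The names NOTHING, member, and json_equal are illustrative, not taken from any library; the point is only that a missing member must not collapse into Python's None, which is what JSON null parses to.

```python
NOTHING = object()  # "no value at all", distinct from JSON null (None)


def member(node, key):
    """Look up a member, returning NOTHING when it is absent."""
    if isinstance(node, dict) and key in node:
        return node[key]
    return NOTHING


def json_equal(a, b):
    """Equality in which Nothing equals only Nothing."""
    if a is NOTHING or b is NOTHING:
        return a is b
    # Keep JSON true/false distinct from the numbers 1/0, which Python
    # would otherwise treat as equal.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


rows = [
    {"id": 1, "deleted_at": None},
    {"id": 2},
    {"id": 3, "deleted_at": "2024-01-01"},
]
# Rough equivalent of the filter @.deleted_at == null
print([r["id"] for r in rows if json_equal(member(r, "deleted_at"), None)])
```

A library that implements member lookup with dict.get() loses this distinction immediately, because the absent key and the explicit null both come back as None.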

Performance pitfalls cluster around the recursive-descent operator (..). A query like $..id on a 50 MB document with deeply nested arrays will visit every node, and if the implementation materialises intermediate nodelists rather than streaming, peak memory can exceed input size. Filters inside recursive descent are worse: $..[?(@.active)] runs the filter against every node, including scalars where the predicate is meaningless. Cycle detection matters because while strict JSON has no cycles, in-memory representations populated from YAML anchors or programmatic graph builders can. A naive .. implementation will recurse forever; a careful one tracks visited identities and breaks.
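One way to sketch the "careful" variant is to track the identities of the containers currently on the path from the root, and refuse to re-enter any of them. The function name is hypothetical. Tracking only ancestors, rather than everything ever visited, is a design choice: it breaks true cycles while still visiting a subtree that is merely shared between two parents, which is what YAML aliases usually produce.

```python
def descendants_no_cycles(node, _ancestors=None):
    """Pre-order walk that skips any container already on the current path."""
    ancestors = set() if _ancestors is None else _ancestors
    is_container = isinstance(node, (dict, list))
    if is_container:
        if id(node) in ancestors:
            return  # re-entering an ancestor would loop forever
        ancestors.add(id(node))
    yield node
    if is_container:
        children = node.values() if isinstance(node, dict) else node
        for child in children:
            yield from descendants_no_cycles(child, ancestors)
        ancestors.discard(id(node))  # leaving this container's subtree


# A programmatically built graph in which a node contains itself.
parent = {"id": 1, "children": []}
parent["children"].append(parent)

ids = [n["id"] for n in descendants_no_cycles(parent)
       if isinstance(n, dict) and "id" in n]
print(ids)
```

This sketch is still recursive, so a pathologically deep (but acyclic) document can exhaust Python's recursion limit; an explicit stack avoids that.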

Where the path is the contract.

Kubernetes uses JSONPath for kubectl get -o jsonpath. GitHub Actions, GitLab CI, and Argo Workflows use it for conditional steps. Datadog, New Relic, and Splunk pipelines use it to extract fields from event JSON. Postman and curl alternatives use it for response assertions. The advantage over jq is portability: JSONPath libraries exist in every language, the syntax is small, and the queries are read-only by design (no transformations, no aggregations). The disadvantage shows up when you need to shape output, for example to filter and rename fields or compute aggregates. Reach for jq when the path becomes a transformation.

jq is a different tool

jq has its own path syntax (.store.books[].title) plus pipes, recursion, function definitions, and SQL-like aggregates. JSONPath is the lowest-common-denominator across ecosystems; jq is the power tool. Both have their place — even in the same pipeline.

Selectivity matters. JSONPath always parses the entire document into memory before evaluating, which is fine for kilobytes and painful for gigabytes. When the JSON is large but the target is a known path, JSON Pointer (RFC 6901) plus a streaming parser like simdjson or ijson will outperform JSONPath by one to two orders of magnitude, because Pointer expressions are deterministic and can be matched against a SAX event stream without ever materialising the tree. When the JSON arrives as a stream — newline-delimited JSON from Kafka, server-sent events, log tails — path queries are the wrong shape entirely; you want event-driven extraction, and tools like jq's stream mode, jaq, or hand-written SAX consumers exist precisely for this case.
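The determinism of JSON Pointer is visible in how little code it takes to resolve one. This sketch resolves a pointer against an already-parsed document, so it shows the lookup rules from RFC 6901 rather than the streaming speed-up itself; the function name resolve_pointer is illustrative.

```python
def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against a parsed document."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty pointer must start with '/'")
    node = doc
    for raw in pointer[1:].split("/"):
        # Unescape ~1 before ~0, as the RFC requires.
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            is_index = token.isascii() and token.isdigit()
            if not is_index or (len(token) > 1 and token[0] == "0"):
                raise KeyError(f"not a valid array index: {token!r}")
            node = node[int(token)]
        elif isinstance(node, dict):
            node = node[token]
        else:
            raise KeyError(f"cannot descend into a scalar at {token!r}")
    return node


doc = {"store": {"book": [{"title": "Dune"}]}, "a/b": 1, "m~n": 2}
print(resolve_pointer(doc, "/store/book/0/title"))
```

Every token names exactly one child, with no wildcards, filters, or descent. That is the property that lets a streaming parser decide, event by event, whether it is still on the path.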
