Caching (query / request cache)

Elasticsearch relies on three distinct caches (the node query cache, the shard request cache, and the OS page cache) that together make repeated filters and aggregations cheap on warm data.

Why it matters

Dashboards and filtered searches re-run nearly identical queries constantly; recomputing the same filter bitset or aggregation every time wastes CPU and I/O. Knowing which cache serves what — and what invalidates it — is the difference between a snappy Kibana board and one that hammers the cluster on every refresh.

How it works

Three layers, each keyed and invalidated differently:

| Cache | Scope | Caches | Invalidated by |
|---|---|---|---|
| Node query cache | Per node, LRU | Filter-context bitsets, per segment | Segment merge or deletion |
| Shard request cache | Per shard | Whole-request results (aggs, hits.total) | Any refresh on the shard that picks up changed data |
| OS page cache | Per node | Raw Lucene file blocks | Memory pressure |

  • Only filter context is query-cached: scored query-context clauses (BM25) aren’t cached because their scores depend on the query, so put cacheable predicates in filter or must_not.
  • Request cache requires size:0 by default: it shines for aggregations and hits.total, where the hit list is empty. It is keyed on the whole request JSON, so any difference (even key order) misses.
  • Refresh busts the request cache: each refresh that picks up changed data invalidates the shard’s entries, so the cache barely helps on an index that is written and refreshed frequently; it pays off on read-only cold indices where data is stable.
  • Heuristic admission: the node query cache only caches filters it has seen reused, and only on segments above a size threshold, to avoid churn on tiny or rare filters.
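
The key-order point above can be illustrated without a cluster. The sketch below is only an illustration: the SHA-256 digest is a stand-in for a cache key and is not Elasticsearch's actual keying scheme, and the status field and its value are hypothetical. It shows that two logically identical request bodies serialize to different bytes unless the client emits keys in a fixed order.

```python
import hashlib
import json


def canonical_body(body: dict) -> str:
    """Serialize a request body with sorted keys and fixed separators."""
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


def illustrative_key(serialized: str) -> str:
    """Stand-in for a cache key: a digest of the serialized request text."""
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Two logically identical requests built with keys in a different order.
# The "status" field and "error" value are hypothetical.
request_a = {"size": 0, "query": {"bool": {"filter": [{"term": {"status": "error"}}]}}}
request_b = {"query": {"bool": {"filter": [{"term": {"status": "error"}}]}}, "size": 0}

# Plain json.dumps keeps insertion order, so the serialized text differs.
naive_a = illustrative_key(json.dumps(request_a))
naive_b = illustrative_key(json.dumps(request_b))

# Sorted keys make both requests serialize to the same text.
stable_a = illustrative_key(canonical_body(request_a))
stable_b = illustrative_key(canonical_body(request_b))

print(naive_a == naive_b)    # False
print(stable_a == stable_b)  # True
```

Sorting keys does not reorder list elements, so clauses inside a filter array still need to be emitted in a consistent order by whatever builds the request.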

Example

A dashboard runs the same range on @timestamp plus a terms agg every 30 s. Wrapped in filter with size:0 against a read-only warm index, the first call computes; subsequent calls hit the shard request cache and return in single-digit ms. Move the same query to today’s hot index (refreshing every 1 s) and the request cache misses almost every time — the agg recomputes.
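
A minimal sketch of how such a dashboard request might be built so that every refresh within the same hour sends an identical body. Everything specific here is an assumption made for illustration: the build_dashboard_request helper, the service.name aggregation field, the 24-hour window, and the choice of absolute hour-aligned bounds instead of date math.

```python
import json
from datetime import datetime, timedelta, timezone


def build_dashboard_request(now: datetime, hours: int, agg_field: str) -> dict:
    """Build a size:0 request with an hour-aligned range filter and a terms agg."""
    # Floor the upper bound to the hour so the body is stable within that hour.
    upper = now.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    lower = upper - timedelta(hours=hours)
    return {
        "size": 0,  # no hit list, only hits.total and aggregations
        "query": {
            "bool": {
                # Filter context: no scoring for the time predicate.
                "filter": [
                    {
                        "range": {
                            "@timestamp": {
                                "gte": lower.isoformat(),
                                "lt": upper.isoformat(),
                            }
                        }
                    }
                ]
            }
        },
        "aggs": {"top_values": {"terms": {"field": agg_field}}},
    }


# Two dashboard refreshes 30 seconds apart produce the same request body.
first = build_dashboard_request(
    datetime(2024, 5, 1, 10, 15, 0, tzinfo=timezone.utc), 24, "service.name"
)
second = build_dashboard_request(
    datetime(2024, 5, 1, 10, 15, 30, tzinfo=timezone.utc), 24, "service.name"
)
print(first == second)
print(json.dumps(first, sort_keys=True, indent=2))
```

The trade-off is freshness: the window ends at the last full hour, which suits a stable read-only index but not a view of the last few minutes.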

Pitfalls

  • Putting cacheable predicates in query context: a date filter inside must (scored) skips the query cache; move it to filter.
  • Expecting caching on a hot index: with per-second refresh on an actively written index the request cache is invalidated continuously; the cache benefit lives on stable data.
  • Unrounded now in range filters: gte: now-1h resolves to the current millisecond, so the filter differs on every request and is never reused; round it (now-1h/h) so requests within the same hour share one cacheable filter.
  • Sizing heap for caches: the node query cache defaults to about 10% of heap (indices.queries.cache.size) and the request cache to about 1% (indices.requests.cache.size); raising them blindly steals heap from indexing and search.
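
The first and third pitfalls can be checked mechanically before a query is sent. The helpers below are a hypothetical sketch, not part of any Elasticsearch client: move_ranges_to_filter moves only range clauses (they contribute a constant score, so relative ranking should be unaffected), and unrounded_now_bounds uses a simple heuristic that treats any bound starting with now and containing no "/" as unrounded.

```python
import copy


def move_ranges_to_filter(query: dict) -> dict:
    """Return a copy of a bool query with range clauses moved from must to filter."""
    result = copy.deepcopy(query)
    bool_query = result.get("bool", {})
    must = bool_query.get("must", [])
    if isinstance(must, dict):  # a single clause may be given as an object
        must = [must]
    moved = [clause for clause in must if "range" in clause]
    kept = [clause for clause in must if "range" not in clause]
    if not moved:
        return result
    existing = bool_query.get("filter", [])
    if isinstance(existing, dict):
        existing = [existing]
    bool_query["filter"] = existing + moved
    if kept:
        bool_query["must"] = kept
    else:
        bool_query.pop("must", None)
    return result


def unrounded_now_bounds(node) -> list:
    """Collect (field, bound, expression) for range bounds using now without rounding."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "range" and isinstance(value, dict):
                for field, bounds in value.items():
                    if not isinstance(bounds, dict):
                        continue
                    for bound, expr in bounds.items():
                        # Heuristic: "now..." with no "/unit" suffix is unrounded.
                        if isinstance(expr, str) and expr.startswith("now") and "/" not in expr:
                            found.append((field, bound, expr))
            else:
                found.extend(unrounded_now_bounds(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(unrounded_now_bounds(item))
    return found


# Hypothetical query: a scored match plus a date range placed in must.
query = {
    "bool": {
        "must": [
            {"match": {"message": "timeout"}},
            {"range": {"@timestamp": {"gte": "now-1h"}}},
        ]
    }
}
rewritten = move_ranges_to_filter(query)
print(rewritten)
print(unrounded_now_bounds(rewritten))
```

Term and match clauses are deliberately left in must, since moving them would change scoring; whether a given predicate needs to score is a decision for whoever owns the query.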

See also