Use cases (search, logging, observability, analytics)

The four workloads Elasticsearch is bought for — and how their data shape and access patterns differ enough to change how you configure the cluster.

Why it matters

“Search” and “logs” pull a cluster in opposite directions: search is read-heavy with mutable, low-volume documents; logs are write-heavy, append-only, and time-partitioned. Picking the wrong sharding or lifecycle model for the workload is a common cause of a cluster that is slow or runs out of disk.

How it works

The same engine, configured differently per workload.

  • Site/app search — relatively static documents, BM25 relevance matters, plus autocomplete and faceting. A few well-sized indices.
  • Logging — append-only events keyed by @timestamp; use data streams + ILM to roll over and age out.
  • Observability / APM — logs + metrics + traces unified; high cardinality, retention tiers via hot-warm-cold.
  • Analytics / BI — heavy metric and pipeline aggregations over large windows; runtime fields for ad-hoc dimensions.

  Workload        Write pattern               Key feature           Lifecycle
  Search          Mutable, low volume         BM25, autocomplete    Manual reindex
  Logging         Append-only, high volume    Data streams          ILM rollover
  Observability   Mixed signals               Correlation           Tiered retention
  Analytics       Bulk loads                  Aggregations          Snapshot/archive
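
To make the analytics row more concrete, here is a minimal sketch of a search request body that combines a runtime field with a metric and a pipeline aggregation. The field names (`bytes`, `kb`), the aggregation names, and the 30-day window are assumptions for illustration and do not come from any real mapping; the body is what would be sent to an index's `_search` endpoint.

```python
import json


def daily_kb_trend_request(days: int = 30) -> dict:
    """Build a _search body: average payload size per day plus its day-over-day change."""
    return {
        "size": 0,  # return aggregations only, no hits
        "runtime_mappings": {
            # Hypothetical ad-hoc dimension, computed at query time
            # from an assumed numeric `bytes` field.
            "kb": {
                "type": "double",
                "script": {"source": "emit(doc['bytes'].value / 1024.0)"},
            }
        },
        # Restrict the scan to a bounded time window.
        "query": {"range": {"@timestamp": {"gte": f"now-{days}d/d"}}},
        "aggs": {
            "per_day": {
                "date_histogram": {"field": "@timestamp", "calendar_interval": "day"},
                "aggs": {
                    # Metric aggregation over the runtime field.
                    "avg_kb": {"avg": {"field": "kb"}},
                    # Pipeline aggregation: change in avg_kb between adjacent days.
                    "kb_change": {"derivative": {"buckets_path": "avg_kb"}},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(daily_kb_trend_request(), indent=2))
```

Because the runtime field is evaluated per document at query time, a query like this trades index-time cost for query-time cost, which tends to suit ad-hoc analysis better than hot-path search.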

Example

A logging data stream sized for ~50 GB/day with 7-day hot retention:

data stream "logs-app-prod"
  ILM: rollover at 50GB or 1d  ──▶ hot (7d) ──▶ warm (30d) ──▶ delete
  shard target ≈ 30–50 GB each
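
One way the sketch above might translate into request bodies is shown here: an ILM policy (for `PUT _ilm/policy/<name>`) and a matching index template (for `PUT _index_template/<name>`). Several details are assumptions rather than part of the original example: the policy name, the template priority, the single primary shard, the `set_priority` action in the warm phase, and the reading of "hot (7d), warm (30d)" as time spent in each phase, which puts deletion 37 days after rollover. The 50 GB threshold is expressed as `max_primary_shard_size` so that it lines up with the per-shard target.

```python
import json

POLICY_NAME = "logs-app-prod-policy"  # hypothetical name, not from the source


def ilm_policy() -> dict:
    """ILM policy body: roll over in hot, move to warm after 7d, delete after 37d."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when a primary shard reaches 50 GB
                        # or the backing index is one day old.
                        "rollover": {"max_primary_shard_size": "50gb", "max_age": "1d"}
                    }
                },
                "warm": {
                    "min_age": "7d",  # measured from rollover
                    "actions": {"set_priority": {"priority": 50}},  # assumed action
                },
                "delete": {
                    "min_age": "37d",  # assumption: 7d hot + 30d warm
                    "actions": {"delete": {}},
                },
            }
        }
    }


def index_template() -> dict:
    """Index template body that makes logs-app-prod a data stream using the policy."""
    return {
        "index_patterns": ["logs-app-prod*"],
        "data_stream": {},  # matching indices are created as data stream backing indices
        "priority": 500,  # assumed value, chosen to win over built-in logs templates
        "template": {
            "settings": {
                "index.lifecycle.name": POLICY_NAME,
                "number_of_shards": 1,  # assumption: one primary per backing index
                "number_of_replicas": 1,
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(ilm_policy(), indent=2))
    print(json.dumps(index_template(), indent=2))
```

Under these assumptions, 50 GB/day over 7 days is roughly 350 GB of primary data on the hot tier, or about 700 GB with one replica.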

Compare a product-search index: 1 primary + 1 replica, tuned analyzers, rebuilt via reindex on mapping changes.
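
A rough sketch of the search side, for contrast: an index creation body with one primary, one replica, and a custom autocomplete analyzer, plus the bodies for `POST _reindex` and `POST _aliases` that a rebuild might use. The index names, field names, analyzer settings, and the alias swap itself are assumptions for illustration; the source only says the index is rebuilt via reindex.

```python
import json


def product_index_body() -> dict:
    """Index creation body: 1 primary + 1 replica with an edge-ngram autocomplete analyzer."""
    return {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 1,
            "analysis": {
                "filter": {
                    # Assumed gram sizes; tune against real queries.
                    "autocomplete_filter": {"type": "edge_ngram", "min_gram": 2, "max_gram": 15}
                },
                "analyzer": {
                    "autocomplete": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "autocomplete_filter"],
                    }
                },
            },
        },
        "mappings": {
            "properties": {
                # Index with edge n-grams, search with the plain analyzer.
                "name": {"type": "text", "analyzer": "autocomplete", "search_analyzer": "standard"},
                # keyword field for faceting via terms aggregations
                "category": {"type": "keyword"},
            }
        },
    }


def reindex_body(old_index: str, new_index: str) -> dict:
    """Body for POST _reindex: copy documents into the index with the new mapping."""
    return {"source": {"index": old_index}, "dest": {"index": new_index}}


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST _aliases: repoint the alias from old to new in one atomic call."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    print(json.dumps(product_index_body(), indent=2))
    print(json.dumps(reindex_body("products-v1", "products-v2"), indent=2))
    print(json.dumps(alias_swap_body("products", "products-v1", "products-v2"), indent=2))
```

Querying through an alias rather than a concrete index name is one common way to make the rebuild invisible to clients, though the source does not prescribe it.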

Pitfalls

  • One config for all — over-sharding a search index or under-using ILM for logs both end badly.
  • High-cardinality aggregations — terms or cardinality aggregations over millions of unique values can exhaust heap; bound them (a terms size limit, composite pagination, or a modest precision_threshold).
  • Time-series without rollover — a single ever-growing index becomes unmanageable; use data streams from day one.
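
As one possible way to bound a high-cardinality aggregation, the sketch below pages through buckets with a composite aggregation instead of requesting every unique value at once. The `search` callable is a stand-in for whatever client call actually sends the request and returns the parsed response; it, the aggregation name `by_key`, and the example field `user.id` are hypothetical.

```python
from typing import Callable, Iterator, Optional


def composite_page_request(field: str, page_size: int, after: Optional[dict] = None) -> dict:
    """Build a _search body requesting one page of composite buckets over `field`."""
    composite = {
        "size": page_size,  # upper bound on buckets returned per request
        "sources": [{"key": {"terms": {"field": field}}}],
    }
    if after is not None:
        composite["after"] = after  # resume after the last key of the previous page
    return {"size": 0, "aggs": {"by_key": {"composite": composite}}}


def iter_buckets(
    search: Callable[[dict], dict], field: str, page_size: int = 1000
) -> Iterator[dict]:
    """Yield every bucket, holding at most one page in memory at a time."""
    after = None
    while True:
        response = search(composite_page_request(field, page_size, after))
        agg = response["aggregations"]["by_key"]
        yield from agg["buckets"]
        after = agg.get("after_key")
        # Stop when the server reports no further key or returns an empty page.
        if after is None or not agg["buckets"]:
            return
```

A caller would pass its own function as `search`, for example a thin wrapper around an HTTP client, and consume `iter_buckets(search, "user.id")` lazily so that neither the cluster nor the client has to materialize millions of buckets in a single response.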

See also