Bulk indexing tuning

The set of settings and client behaviors that maximize sustained write throughput when loading large volumes via the Bulk API — concurrency, batch size, refresh, and replicas.

Why it matters

Default indexing is tuned for near real-time search, not bulk loads; a naive reindex of billions of docs can take days or trip 429s. The right knobs routinely yield 5–20× throughput. The catch: most of them trade search freshness or durability for speed, so they belong on initial loads and reindexes, not steady-state ingest.

How it works

Throughput is bounded by the slowest of: client concurrency, segment refresh churn, replication, and merge I/O.

Knob	Default	Bulk-load value	Effect
Batch size	—	5–15 MB per request	Amortize round-trips
Concurrent requests	1	≈ number of data nodes	Saturate write threads
`refresh_interval`	1s	`-1` during load	Stop per-second segment churn
`number_of_replicas`	1	`0` during load	Skip replicating every write

Size batches by bytes, not docs: 5–15 MB per request is the sweet spot; bigger batches stress the coordinating node’s heap and risk 429 rejections.
Find concurrency empirically: ramp up parallel bulk requests until throughput plateaus or rejections appear; the write thread pool queue is finite.
Disable refresh and replicas during the load, then restore them and let one refresh and one replication pass catch up: far cheaper than paying for both on every batch.
Use auto-generated IDs when possible: providing an `_id` forces a “does this ID already exist?” lookup across segments; auto-generated IDs skip it.
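Sizing by bytes and omitting `_id` can be sketched in a few lines. This is a minimal illustration, not a reference loader: the function name `bulk_bodies`, the index name, and the 10 MB default are assumptions chosen for the example. It only builds NDJSON request bodies and leaves sending them to whatever HTTP client is in use.

```python
import json
from typing import Iterable, Iterator, List


def bulk_bodies(docs: Iterable[dict], index: str,
                max_bytes: int = 10 * 1024 * 1024) -> Iterator[bytes]:
    """Yield NDJSON Bulk API bodies, each capped at roughly max_bytes.

    Hypothetical helper for illustration. A single document larger than
    max_bytes is still emitted, alone, in its own oversized body.
    """
    # The action line carries no "_id", so the cluster auto-generates one.
    action = json.dumps({"index": {"_index": index}}).encode("utf-8") + b"\n"
    buf: List[bytes] = []
    size = 0
    for doc in docs:
        source = json.dumps(doc, separators=(",", ":")).encode("utf-8")
        pair = action + source + b"\n"
        # Flush before this pair would push the batch past the byte budget.
        if buf and size + len(pair) > max_bytes:
            yield b"".join(buf)
            buf, size = [], 0
        buf.append(pair)
        size += len(pair)
    if buf:
        yield b"".join(buf)
```

Counting encoded bytes rather than documents keeps request size stable even when document sizes vary widely, which is the point of the first rule above.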

Example

Load 1 B docs. Before the load, set `refresh_interval: -1` and `number_of_replicas: 0`. Run 4 client threads (matching 4 data nodes) sending ~10 MB batches with auto-generated IDs. On a 429, retry with exponential backoff rather than retrying faster. After the load, restore `number_of_replicas: 1` and `refresh_interval: 1s`, then run one force merge (`_forcemerge`) down to a few segments. Net: a load that took ~10 h at defaults finishes in ~1 h.
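The settings changes around the load can be written down as a small request plan. The sketch below assumes an Elasticsearch-style REST API (`PUT /<index>/_settings` and `POST /<index>/_forcemerge`). The function name `load_plan`, the index name `docs`, and the segment count of 5 are illustrative assumptions, since the example above only says "a few segments". Nothing is sent over the network; the function just returns the requests to issue.

```python
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[dict]]  # (HTTP method, path, JSON body)


def load_plan(index: str, replicas: int = 1, refresh: str = "1s",
              max_segments: int = 5) -> Tuple[List[Request], List[Request]]:
    """Return (before, after) request lists for a bulk load.

    Hypothetical helper: max_segments=5 is an illustrative value only.
    """
    settings_path = f"/{index}/_settings"
    before: List[Request] = [
        # Stop periodic refreshes and skip replication while loading.
        ("PUT", settings_path,
         {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}),
    ]
    after: List[Request] = [
        # Restore durability and search freshness once the load is done.
        ("PUT", settings_path,
         {"index": {"refresh_interval": refresh,
                    "number_of_replicas": replicas}}),
        # Then merge down to a small number of segments.
        ("POST", f"/{index}/_forcemerge?max_num_segments={max_segments}", None),
    ]
    return before, after


before, after = load_plan("docs")
for method, path, body in before + after:
    print(method, path, body)
```

Keeping the "after" steps next to the "before" steps makes it harder to forget the restore, which is the failure mode that leaves an index without replicas.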

Pitfalls

Leaving `number_of_replicas: 0` permanently: great for loading, but losing a node then means losing data; always restore replicas before going live.
Hammering through 429: rejections mean the write queue is full, and retrying faster makes it worse. Back off, and retry only the items that were rejected, since the bulk response reports a status per item.
Over-large batches: multi-hundred-MB bulks cause heap pressure and GC pauses, which lowers throughput.
Client bottleneck: a single-threaded loader on a multi-node cluster leaves most write capacity idle; parallelize the client too.
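Handling 429 per item could look roughly like the following. It assumes the usual Bulk API response shape, where `items` lists one single-key object per action in request order, each with a `status`. The names `rejected_positions` and `backoff_delay`, and the 0.5 s base and 60 s cap, are assumptions for illustration. The jittered delay is one common backoff variant, not something the text above prescribes.

```python
import random
from typing import List


def rejected_positions(bulk_response: dict) -> List[int]:
    """Return request-order positions of items rejected with status 429."""
    if not bulk_response.get("errors"):
        return []
    positions = []
    for i, item in enumerate(bulk_response["items"]):
        # Each item is a one-key dict keyed by its action type
        # ("index", "create", ...); the value holds the per-item result.
        result = next(iter(item.values()))
        if result.get("status") == 429:
            positions.append(i)
    return positions


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter.

    Returns a random delay in [0, min(cap, base * 2**attempt)] seconds.
    The base and cap defaults are illustrative assumptions.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

A loader would sleep for `backoff_delay(attempt)` and then resend only the documents at the returned positions, so items that already succeeded are not indexed twice under fresh auto-generated IDs.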
