Elasticsearch vs relational databases
How a search engine differs from an RDBMS — the data model, query model, and consistency guarantees — and why the two are usually deployed together, not as substitutes.
Why it matters
Engineers reach for Elasticsearch expecting Postgres with a search box, then hit missing joins, no transactions, and stale reads. Knowing the boundaries up front decides what stays in your relational source of truth and what you denormalize into an index for fast search and aggregation.
How it works
A relational database normalizes data into tables and joins at query time; Elasticsearch denormalizes into self-contained documents and resolves everything at index time. This shifts the cost from read to write and powers fast analytics over flat data.
- No joins — model relations by denormalizing, nested documents, or
joinfield (parent/child, expensive). Prefer flattening. - No multi-document transactions — only single-doc atomicity, with optimistic concurrency via
_seq_no/_primary_term. - Schema — a mapping is similar to a schema, but
textfields are analyzed into an inverted-index; there is noALTERto change a field’s type. - Reads — near real-time and eventually consistent across replicas, not read-after-write.
| Relational DB | Elasticsearch | |
|---|---|---|
| Unit | Row in a table | JSON document in an index |
| Joins | Native | None (denormalize) |
| Transactions | ACID, multi-row | Single-doc only |
| Strength | Integrity, updates | Full-text + ranking, analytics |
| Consistency | Immediate | Eventual, NRT |
Example
A SELECT * FROM orders JOIN users becomes a denormalized doc:
RDBMS (normalized): orders(user_id FK) ── users(id, name)
Elasticsearch (flat): { "order_id": 5, "user": { "id": 7, "name": "Ada" } }
You accept duplication of user.name across orders to get a join-free, instantly searchable document.
Pitfalls
- Using it as the system of record — no backups-by-transaction, no constraints; keep the authoritative copy in a DB and reindex from it.
- Frequent partial updates — every update rewrites and re-indexes the whole document; heavy mutable data fits an RDBMS better.
- Modeling many-to-many with
join— it forces parent/child on one shard and scales poorly; denormalize instead.