What is Elasticsearch?
Elasticsearch is a distributed, document-oriented search and analytics engine built on Apache Lucene, exposed entirely over a JSON REST API.
Why it matters
It can answer full-text queries over billions of documents in milliseconds by precomputing an inverted index at write time instead of scanning rows at read time. The same engine powers product search, log/observability backends, and ad-hoc analytics, which is why it sits behind many “search box” and “log dashboard” features in production.
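To make the write-time versus read-time trade-off concrete, here is a minimal sketch of an inverted index in plain Python. It is a toy model, not Lucene's data structure: the `ToyInvertedIndex` class and the whitespace `tokenize` helper are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a crude stand-in for a real analyzer)."""
    return text.lower().split()


class ToyInvertedIndex:
    """Hypothetical toy index: maps each term to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        # The work happens at write time: every term points back to the document.
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, term):
        # Read time is one dictionary lookup; no document is scanned.
        return sorted(self.postings.get(term.lower(), set()))


index = ToyInvertedIndex()
index.add(1, "Wireless Mouse")
index.add(2, "Wireless Keyboard")
index.add(3, "Mouse Pad")

print(index.search("mouse"))     # [1, 3]
print(index.search("wireless"))  # [1, 2]
```

The lookup cost depends on the number of matching documents rather than on the size of the whole corpus, which is the property the paragraph above relies on.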
How it works
You PUT JSON documents into an index; each index is split into shards (each shard is a self-contained Lucene index) spread across nodes for horizontal scale.
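How a document ends up on a particular shard can be sketched as a hash of its routing value modulo the number of primary shards. The snippet below is a simplified illustration under stated assumptions: `pick_shard` is a hypothetical helper, and `zlib.crc32` stands in for the Murmur3 hash that Elasticsearch applies to the routing value (the document id by default).

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a shard number in [0, num_primary_shards).

    Assumption: crc32 is only a stand-in hash for this sketch.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


NUM_PRIMARY_SHARDS = 3  # illustrative value

# The same id always lands on the same shard, so reads know where to look.
placement = {doc_id: pick_shard(doc_id, NUM_PRIMARY_SHARDS) for doc_id in ["1", "2", "3", "42"]}
print(placement)
```

Because the shard is derived from the primary shard count, changing that count would move existing documents, which is one reason it is fixed when the index is created.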
- Schema-aware: a mapping declares field types; `text` fields are run through analyzers to build the inverted index, while `keyword` and numeric fields are indexed as exact values.
- Near real-time: writes are searchable after a refresh (default 1s), not instantly, because new documents are buffered in memory until a refresh writes them to a searchable Lucene segment.
- Distributed by default: replicas provide HA and read throughput; a cluster coordinates allocation and failover.
- Relevance-ranked: results are scored by BM25, not just matched.
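The BM25 scoring mentioned in the last bullet can be illustrated with the textbook formula. This is a rough sketch, not Lucene's exact implementation (which differs in constant factors and in how it encodes document lengths); `bm25_term_score`, `rank`, and the three tokenized documents are made up for the example, while `k1=1.2` and `b=0.75` are the commonly documented defaults.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, docs_with_term, k1=1.2, b=0.75):
    """Textbook BM25 score of one query term for one document."""
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Term frequency saturates, and long documents are penalized via b.
    tf_part = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * tf_part


# Hypothetical, already-tokenized documents.
docs = {
    1: ["wireless", "mouse"],
    2: ["wireless", "keyboard"],
    3: ["mouse", "pad", "for", "gaming", "mouse", "users"],
}


def rank(term):
    """Return (doc_id, score) pairs for documents containing the term, best first."""
    avg_len = sum(len(tokens) for tokens in docs.values()) / len(docs)
    matching = [doc_id for doc_id, tokens in docs.items() if term in tokens]
    scored = [
        (doc_id, bm25_term_score(docs[doc_id].count(term), len(docs[doc_id]),
                                 avg_len, len(docs), len(matching)))
        for doc_id in matching
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for doc_id, score in rank("mouse"):
    print(doc_id, round(score, 3))
```

In this toy corpus doc 1 ranks above doc 3 even though doc 3 contains "mouse" twice, because length normalization favors the shorter document.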
| Aspect | Elasticsearch |
|---|---|
| Data model | JSON documents in indices |
| Interface | REST + Query DSL (JSON) |
| Consistency | Near real-time, eventually consistent reads |
| Scaling | Horizontal via shards + replicas |
Example
PUT /products/_doc/1
{ "name": "Wireless Mouse", "price": 24.99, "in_stock": true }
GET /products/_search
{ "query": { "match": { "name": "mouse" } } }
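The `match` query analyzes "mouse" with the same analyzer that was applied to the `name` field at index time (so "Wireless Mouse" was stored as the lowercase terms `wireless` and `mouse`), looks the term up in the inverted index, and returns doc 1 with a `_score`, provided a refresh has already made the document searchable.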
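Why the lowercase query "mouse" finds "Wireless Mouse" comes down to running the same analysis on both sides. The sketch below imitates that with a hypothetical `analyze` function (split on non-alphanumerics, then lowercase), which only approximates a standard-style analyzer, and contrasts it with the exact comparison a `keyword` field would use.

```python
import re


def analyze(text):
    """Rough imitation of a standard-style analyzer: tokenize, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


stored_name = "Wireless Mouse"
text_terms = analyze(stored_name)  # what an analyzed text field would index
keyword_term = stored_name         # what an exact-value field would index


def match_text(query):
    # The query goes through the same analyzer; any shared term is a hit.
    return any(term in text_terms for term in analyze(query))


def match_keyword(query):
    # Exact fields compare the whole value, case included.
    return query == keyword_term


print(text_terms)                       # ['wireless', 'mouse']
print(match_text("mouse"))              # True
print(match_keyword("mouse"))           # False
print(match_keyword("Wireless Mouse"))  # True
```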
Pitfalls
- Treating it as a system of record: it has no multi-document transactions and only limited join support; keep a source DB and index a denormalized view (see elasticsearch-vs-relational-databases).
- Expecting read-after-write: a doc isn’t searchable until the next refresh; `?refresh=true` forces one immediately but hurts throughput, while `?refresh=wait_for` makes the request wait for the next scheduled refresh instead.
- Confusing the products: “Elasticsearch” is one component of the broader Elastic Stack (Kibana, Beats, Logstash).
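The read-after-write pitfall is easier to see in a small simulation. The class below is a toy model with hypothetical names (`ToyNearRealTimeIndex`, `search_ids`), not Elasticsearch's implementation: documents sit in a buffer until `refresh()` makes them visible, and the `refresh=True` flag plays the role of `?refresh=true`.

```python
class ToyNearRealTimeIndex:
    """Toy model of near real-time visibility (not how Lucene stores data)."""

    def __init__(self):
        self._buffer = {}      # accepted writes that searches cannot see yet
        self._searchable = {}  # documents visible to searches

    def index(self, doc_id, doc, refresh=False):
        self._buffer[doc_id] = doc
        if refresh:
            # Analogous to ?refresh=true: pay the refresh cost on every write.
            self.refresh()

    def refresh(self):
        # Move buffered documents into the searchable set.
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search_ids(self):
        return sorted(self._searchable)


idx = ToyNearRealTimeIndex()
idx.index(1, {"name": "Wireless Mouse"})
print(idx.search_ids())  # [] because the write is accepted but not yet visible

idx.refresh()
print(idx.search_ids())  # [1]

idx.index(2, {"name": "Mouse Pad"}, refresh=True)
print(idx.search_ids())  # [1, 2]
```

Tests and scripts that index a document and search for it straight away hit exactly the first case, so they need either an explicit refresh or a retry.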