Normalizers

A normalizer is a stripped-down analyzer for keyword fields: it applies character and token filters but produces exactly one token, so the value stays exact while gaining case- and accent-insensitive matching.

Why it matters

A keyword field is matched verbatim, so "ACTIVE" does not match "active" and aggregation buckets split by case. You cannot attach a regular analyzer to a keyword field, because it would tokenize the value. A normalizer solves this: with lowercase and asciifolding, "Café" and "cafe" collapse to one bucket and match the same term query, and the field stays a single-term keyword rather than becoming analyzed text.

How it works

A normalizer runs char_filters and a restricted set of single-token filters; no tokenizer is allowed, guaranteeing one token out.

Allowed filters: only those that work character by character and never split the value, such as lowercase, uppercase, asciifolding, trim, and pattern_replace; char filters such as mapping are also allowed.
Not allowed: stemmer, synonym, ngram, or any tokenizer, because they need to see the token as a whole or can emit several tokens.
Applied at both index and query time, so a query term is normalized to match the indexed term.
Built-ins: only the lowercase normalizer ships predefined; anything else is a custom normalizer in index settings.
Aggregations: buckets key off the normalized token, so case variants merge.
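
As a sketch of the rules above, a custom normalizer can combine a char filter with character-level token filters. The index name products, the field brand, and the names amp_to_and and brand_norm are hypothetical, chosen only for illustration:

PUT /products
{
  "settings": { "analysis": {
    "char_filter": {
      "amp_to_and": { "type": "mapping", "mappings": ["& => and"] }
    },
    "normalizer": {
      "brand_norm": {
        "type": "custom",
        "char_filter": ["amp_to_and"],
        "filter": ["trim", "lowercase"]
      }
    }
  } },
  "mappings": { "properties": {
    "brand": { "type": "keyword", "normalizer": "brand_norm" }
  } }
}

Char filters run first, then the token filters in the order listed, so under these settings a value such as " R&D " should be indexed as the single term randd.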

| Field setup | Stored term for “Café” | `term: "cafe"` matches? |
| --- | --- | --- |
| `keyword`, no normalizer | `Café` | No |
| `keyword` + `lowercase` normalizer | `café` | No (accent) |
| `keyword` + `lowercase`, `asciifolding` | `cafe` | Yes |

Example

PUT /users
{ "settings": { "analysis": { "normalizer": {
    "lc_fold": { "type": "custom", "filter": ["lowercase", "asciifolding"] } } } },
  "mappings": { "properties": {
    "country": { "type": "keyword", "normalizer": "lc_fold" } } } }

POST /users/_doc?refresh=true
{ "country": "Perú" }

# The term query below matches the document indexed above
GET /users/_search
{ "query": { "term": { "country": "peru" } } }

The term stored in the index is peru, while _source still returns the original Perú; a terms aggregation now reports one peru bucket instead of splitting Perú/peru/PERU.
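
To check what the normalizer produces, the _analyze API accepts a normalizer parameter, and a terms aggregation shows the merged bucket. This is a minimal sketch against the users index above; the aggregation name by_country is an arbitrary choice:

GET /users/_analyze
{ "normalizer": "lc_fold", "text": "PERÚ" }

GET /users/_search
{ "size": 0,
  "aggs": { "by_country": { "terms": { "field": "country" } } } }

The first request should return a single token, peru; the second should list one peru bucket whose doc_count covers every case and accent variant that was indexed.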

Pitfalls

Using a tokenizer: the settings are rejected; normalizers forbid tokenizers and multi-token filters.
Changing it later: like analyzers, a changed normalizer only affects newly indexed docs; existing values need a reindex.
match vs term: both term and match normalize the query value on a normalized keyword field, so either works; a difference the normalizer does not cover (for example an accent under lowercase alone) still fails to match.
Highlighting: keyword + normalizer highlights the whole field, not sub-spans.
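
For the reindex mentioned under "Changing it later", one common approach is to create a new index with the updated normalizer and copy documents across with the _reindex API. A minimal sketch, assuming a hypothetical target index users_v2 that has already been created with the new settings and mappings:

POST /_reindex
{ "source": { "index": "users" },
  "dest":   { "index": "users_v2" } }

Each document's _source is written through the target mapping, so values are normalized again as they land in users_v2.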
