Complex Data Structures

The specialized structures — union-find, Fenwick/segment trees, balanced trees, tries, suffix structures — that turn an O(n) per-query problem into O(log n) or O(α(n)) by precomputing structure over the data.

Why it matters

Once a problem asks “answer many queries over a changing collection,” a flat arrays-and-dynamic-arrays or hash-tables forces O(n) per query and the total blows up. These structures pay an O(n) or O(n log n) build cost once, then serve each query in sub-linear time. They are the backbone of databases (B-tree indexing), version control (Merkle/suffix-trees), networking routers (tries), and nearly every hard competitive problem. Picking the right one is a modeling skill, not memorization.

How it works

The choice is driven by the query type, not the data. Match the access pattern to the structure:

Need	Structure	Query	Build
Connectivity / merging sets	disjoint-set-union-find	~O(α(n))	O(n)
Prefix sum + point update	fenwick-trees	O(log n)	O(n)
Range query + range update	segment-trees	O(log n)	O(n)
Ordered keys, balanced	avl-trees, red-black-trees	O(log n)	O(n log n)
Prefix / dictionary lookup	tries	O(L)	O(ΣL)
Substring / pattern search	suffix-arrays, suffix-trees	O(m log n)/O(m)	O(n)–O(n log n)

Most reduce a problem to a tree over index ranges or value ranges, so a query touches O(log n) nodes instead of n elements.
The recurring trade is build cost vs query cost: amortize the build only when queries are many.
Static data unlocks cheaper variants (sorted arrays-and-dynamic-arrays + searching beats a tree when nothing changes).

Example

10⁶ “is u connected to v?” queries on a graph with edges added over time. A BFS/graph-algorithms per query is ~10⁶ × 10⁶ = 10¹² ops — minutes to hours. disjoint-set-union-find answers each in near-constant time: ~10⁶ × 5 ≈ 5×10⁶ ops, milliseconds. Same problem, six orders of magnitude, purely from structure choice.

Pitfalls

Reaching for a heavy structure when sorted-array + searching suffices. If data is static, you often do not need a tree at all.
Ignoring the constant factor. A segment tree has ~4n nodes and pointer/recursion overhead; a Fenwick tree is ~10× simpler and faster when it fits the problem.
Building per query instead of once. The O(n) build must be hoisted out of the query loop or the asymptotic win evaporates.
Wrong structure for the update kind. Fenwick handles point updates trivially but range updates need a trick; default to segment trees when both ranges move.

tech-studies

Explorer

Complex Data Structures

Complex Data Structures

Why it matters

How it works

Example

Pitfalls

See also

Graph View

Table of Contents

Backlinks