API gateway (routing, aggregation, offloading)

An API gateway is a single entry point in front of a set of services that routes requests, composes responses, and handles cross-cutting concerns so the services don’t have to.

Why it matters

Without a gateway, every client must know each service’s address, re-implement auth and rate limiting, and make many round-trips to assemble one screen. A gateway gives clients one stable endpoint, centralizes auth, TLS, and rate limiting, and shrinks chatty mobile calls into one. It also decouples the public API from internal service topology, so you can split or rename services behind it without breaking clients.

How it works

The gateway is a specialized reverse proxy with three core jobs:

CapabilityWhat it does
RoutingMaps an external path to an internal service, often via service-discovery
AggregationFans out to several services and composes one response
OffloadingPulls cross-cutting work (auth, TLS, caching) out of services

Routing matches on path, host, or headers and forwards (e.g. /orders/* → order service). Aggregation turns N client round-trips into one: the gateway calls several services in parallel and merges the result, ideal for mobile over high-latency links. Offloading centralizes concerns every service would otherwise duplicate — TLS termination, authentication and rate limiting, response caching, request logging, and throttling.

For different consumers, the Backend-for-Frontend (BFF) variant runs a tailored gateway per client type (web, mobile, partner), so one client’s needs don’t bloat another’s API. Unlike a plain L4 load balancer, a gateway works at L7 and understands the request payload.

Example

GET /dashboard  (one mobile call)
        │
   API gateway ── verify JWT, check rate limit
        ├─→ user-svc      GET /users/42
        ├─→ order-svc     GET /orders?user=42   (parallel)
        └─→ notif-svc     GET /unread?user=42
        │
   merge → { user, orders, unread }  → single 200 response

The phone makes one request over a slow link instead of three; auth runs once at the edge.

Pitfalls

  • God gateway — piling business logic into the gateway recreates a monolith and a deploy bottleneck; keep it to routing and cross-cutting concerns.
  • Single point of failure — one gateway fronts everything, so it must be replicated behind a load balancer.
  • Added latency — every request gains a hop; mitigate with caching and parallel aggregation, not sequential calls.
  • Coupling clients to internal shape — leaking service structure through the gateway defeats the decoupling it exists to provide.

See also