34% of API costs come from rate limit retries and connection overhead. Every request has a price tag — not just compute time, but TCP handshakes, TLS negotiation, header processing, and retry logic. The gateway choices you make determine whether these costs help protect your backend or hurt your budget.

The Unseen Cost Layers

API gateway costs aren't just requests × price. They're:

  • Connection overhead: TCP handshake (3 packets), TLS negotiation (8-12 packets), then your actual payload
  • Rate limit enforcement: Token bucket checks, sliding window tracking, per-endpoint quotas
  • Retry logic: Exponential backoff adds latency and multiplies request count
  • Header inspection: Auth token parsing, tracing headers, rate limit metadata
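
Two of these layers are easy to sketch in a few lines. The following is a minimal illustration, not any particular gateway's implementation: `TokenBucket`, `backoff_delays`, and `expected_attempts` are hypothetical names, the clock is passed in explicitly so the behavior is deterministic, and the retry estimate assumes each attempt fails independently with the same probability (no jitter, no correlated outages).

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""
    rate: float
    capacity: float
    tokens: float
    last: float = 0.0  # time of the previous check, in seconds

    def allow(self, now: float) -> bool:
        # Credit tokens for the time elapsed since the previous check.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # spend one token on this request
            return True
        return False  # a gateway would typically answer HTTP 429 here


def backoff_delays(base_s: float, retries: int) -> list[float]:
    """Delay before each retry: base, 2*base, 4*base, ..."""
    return [base_s * 2 ** i for i in range(retries)]


def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Average attempts per request, assuming independent failures."""
    # Attempt k+1 only happens if the first k attempts all failed.
    return sum(failure_rate ** k for k in range(max_retries + 1))
```

Under these assumptions, a 10% failure rate with up to three retries gives `expected_attempts(0.1, 3)` of about 1.11, so roughly 11% more requests reach the gateway than callers intended, each one paying the overhead listed above.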

These add up. A simple 100ms API call might be 40ms of actual work and 60ms of overhead. At 1M requests/day, that's 60,000 seconds, or roughly 17 hours, of cumulative overhead every day.
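
To rerun this kind of estimate for your own traffic, a small helper is enough. This is a sketch; `daily_overhead_hours` is a hypothetical name, and it assumes every request pays the same fixed overhead.

```python
def daily_overhead_hours(requests_per_day: int, overhead_ms: float) -> float:
    """Cumulative per-request overhead, summed over one day, in hours."""
    total_seconds = requests_per_day * overhead_ms / 1000
    return total_seconds / 3600


# Example inputs: 1M requests/day, 60ms of overhead per request.
print(f"{daily_overhead_hours(1_000_000, 60.0):.1f} hours/day")
```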

Gateways Compared

Three common patterns:

Gateway          | Max Rate | Overhead/req | Price/1M reqs
nginx (layer 7)  | 50K/s    | 2ms          | $15
AWS API Gateway  | 10K/s    | 8ms          | $3.50
Kong             | 30K/s    | 4ms          | $20

Nginx wins on speed, AWS wins on simplicity, Kong wins on flexibility. Your choice should match your traffic profile, not your preferences.
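
One way to match the comparison to a traffic profile is to filter by peak rate first and only then compare cost. The sketch below reuses the figures from the comparison above; the function names and the 30-day month are assumptions made for illustration.

```python
# Figures copied from the comparison above.
GATEWAYS = {
    "nginx": {"max_rps": 50_000, "overhead_ms": 2, "usd_per_million": 15.00},
    "AWS API Gateway": {"max_rps": 10_000, "overhead_ms": 8, "usd_per_million": 3.50},
    "Kong": {"max_rps": 30_000, "overhead_ms": 4, "usd_per_million": 20.00},
}


def monthly_cost_usd(usd_per_million: float, requests_per_day: int, days: int = 30) -> float:
    """Request charges for one month at a steady daily volume."""
    return usd_per_million * requests_per_day * days / 1_000_000


def candidates(peak_rps: int, requests_per_day: int) -> list[tuple[str, float]]:
    """Gateways whose max rate covers the peak, cheapest first."""
    rows = [
        (name, monthly_cost_usd(g["usd_per_million"], requests_per_day))
        for name, g in GATEWAYS.items()
        if g["max_rps"] >= peak_rps
    ]
    return sorted(rows, key=lambda row: row[1])


# A 20K req/s peak rules out the 10K/s option before price is considered.
print(candidates(peak_rps=20_000, requests_per_day=1_000_000))
```

The point of the ordering is that the cheapest price per million requests is irrelevant if the gateway cannot absorb your peak.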

Connection Pooling Saves Money

Connection reuse, whether through keep-alive pools or HTTP/2 and HTTP/3 multiplexing, reduces connection overhead. Each new TCP connection with TLS requires:

  • SYN packet (client → server)
  • SYN-ACK (server → client)
  • ACK (client → server)
  • TLS handshake (8-12 packets, depending on session ticket reuse)

That's 11-15 packets just to open the door before sending real data. Connection pooling keeps the door open, so subsequent requests reuse the established connection instead of repeating the handshake.

At 10K requests/second, connection pooling can reduce packet count by 70% — and packet processing is where most gateway CPU time goes.
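
A back-of-the-envelope model shows where a figure in that range can come from. The handshake counts (11-15 packets) are taken from the list above; the payload size of 5 packets per request and the 100 requests per pooled connection are assumptions chosen for illustration, and the function names are hypothetical.

```python
def packets_per_request(handshake_packets: int, payload_packets: int,
                        requests_per_connection: int) -> float:
    """Average packets per request when the handshake is shared."""
    return payload_packets + handshake_packets / requests_per_connection


def pooling_reduction(handshake_packets: int, payload_packets: int,
                      requests_per_connection: int) -> float:
    """Fraction of packets saved versus one connection per request."""
    unpooled = packets_per_request(handshake_packets, payload_packets, 1)
    pooled = packets_per_request(handshake_packets, payload_packets,
                                 requests_per_connection)
    return 1 - pooled / unpooled


# Assumed: 5 payload packets per request, 100 requests per connection.
for handshake in (11, 15):
    saved = pooling_reduction(handshake, payload_packets=5,
                              requests_per_connection=100)
    print(f"{handshake}-packet handshake: {saved:.0%} fewer packets")
```

Under these assumptions the saving lands between roughly 68% and 74%. Larger payloads shrink it, because the handshake then makes up a smaller share of each request.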

Decision Matrix

Ask these questions:

Question                  | Choose Low Overhead | Choose High Features
Traffic volume            | 10K+ req/s          | 1K-10K req/s
Need rate limiting?       | No                  | Yes
Authentication complexity | Basic (API keys)    | Advanced (JWT, OAuth)
Need tracing/metrics?     | No                  | Yes
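
If it helps to make the matrix mechanical, the four questions can be tallied as votes. This is only one possible reading: the equal weighting, the "mixed" result on a tie, and the name `recommend` are assumptions, and the matrix itself says nothing about traffic below 1K req/s.

```python
def recommend(peak_rps: int, rate_limiting: bool,
              advanced_auth: bool, tracing: bool) -> str:
    """Tally the four matrix rows; each row casts one vote."""
    # Each "yes" on a feature question votes for the high-features side.
    feature_votes = sum([rate_limiting, advanced_auth, tracing])
    overhead_votes = 3 - feature_votes
    if peak_rps >= 10_000:
        overhead_votes += 1
    elif peak_rps >= 1_000:
        feature_votes += 1
    # Below 1K req/s the matrix gives no guidance, so traffic casts no vote.
    if overhead_votes > feature_votes:
        return "low overhead"
    if feature_votes > overhead_votes:
        return "high features"
    return "mixed"


print(recommend(peak_rps=20_000, rate_limiting=True,
                advanced_auth=True, tracing=False))
```

A "mixed" result is a signal to measure before committing, since neither column clearly fits.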

One Last Thing

API gateways aren't free insurance. They're active cost centers — and the architecture decisions determine whether they protect your backend or drain your budget.

Start with the traffic pattern. Then match gateway capabilities to actual needs. Then measure. The cheapest gateway is the one you don't need — but if you do need one, make sure it's the right one for your scale.