API Gateway Patterns for Microservices Architecture

Your mobile app makes 12 API calls on one screen, each to a different microservice. An API gateway turns that into a single call — and handles auth, rate limiting, and caching along the way.

🔌 Architecture January 19, 2026 12 min read

An API gateway sits between your clients and your microservices. It is a single entry point that handles cross-cutting concerns: authentication, rate limiting, request routing, response caching, and protocol translation. Without it, every microservice must implement these concerns independently, and every client must know how to reach every service. We moved a client from direct client-to-service calls to a Kong gateway, and their mobile app went from 14 API calls per screen to 2, with 60% lower latency because the gateway handles aggregation server-side.

1. What an API Gateway Does (and Doesn't Do)

| Gateway Responsibility | Without Gateway | With Gateway |
|---|---|---|
| Authentication | Every service validates JWT tokens | Gateway validates once, forwards user context |
| Rate limiting | Each service implements its own limits | Centralized rate limiting per client/API key |
| Request routing | Client knows every service URL | Client hits one URL, gateway routes internally |
| Response caching | Each service manages its own cache headers | Gateway caches responses, reducing backend load |
| Protocol translation | Client must speak gRPC, GraphQL, REST | Client speaks REST; gateway translates to gRPC internally |
| Request aggregation | Client makes N calls for one screen | Gateway composes responses from multiple services |

What a gateway should NOT do:

2. Gateway Patterns: Edge, BFF, and Mesh

Edge Gateway (Single Entry Point)

One gateway handles all external traffic. It's the simplest pattern and works well when you have a single client type (just a web app) or when all clients need the same API shape.

Edge Gateway Pattern:

 Web App ──┐
            ├──▶ API Gateway ──▶ [User Service]
 Mobile ───┤                  ├──▶ [Order Service]
            │                  ├──▶ [Product Service]
 3rd Party ┘                  └──▶ [Payment Service]

Backend for Frontend (BFF)

A separate gateway per client type. The mobile BFF returns smaller payloads and fewer images. The web BFF returns richer data. The admin BFF has different auth requirements. This is the pattern we use most — different clients have fundamentally different needs.

BFF Pattern:

 Web App ────▶ Web BFF ────────┐
               (rich payloads)  ├──▶ [User Service]
                               ├──▶ [Order Service]
 Mobile ─────▶ Mobile BFF ─────┤     ...
               (slim payloads)  │
                               │
 Admin ──────▶ Admin BFF ──────┘
               (internal APIs)
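
The aggregation a mobile BFF performs can be sketched as follows. This is a minimal illustration, not the implementation behind the numbers above: `fetch_user`, `fetch_orders`, and `mobile_home_screen` are hypothetical names, and the two fetch functions are stubs standing in for real HTTP or gRPC calls to internal services.

```python
import asyncio


# Hypothetical stand-ins for internal service calls; a real BFF would issue
# HTTP or gRPC requests to the user and order services here.
async def fetch_user(user_id: str) -> dict:
    await asyncio.sleep(0.01)  # simulate network latency
    return {"id": user_id, "name": "Ada", "avatar_url": "https://example.com/a.png"}


async def fetch_orders(user_id: str) -> list:
    await asyncio.sleep(0.01)  # simulate network latency
    return [{"id": "o_1", "total": 42.0, "items": ["sku_1", "sku_2"]}]


async def mobile_home_screen(user_id: str) -> dict:
    """Compose one slim payload from two backend calls made concurrently."""
    user, orders = await asyncio.gather(fetch_user(user_id), fetch_orders(user_id))
    # The mobile BFF drops fields the small screen does not need.
    return {
        "user": {"id": user["id"], "name": user["name"]},
        "orders": [{"id": o["id"], "total": o["total"]} for o in orders],
    }


if __name__ == "__main__":
    print(asyncio.run(mobile_home_screen("u_123")))
```

Because the backend calls run concurrently, the client waits roughly as long as the slowest call rather than the sum of all calls, and it makes one round trip instead of several.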

Service Mesh (Internal Gateway)

A service mesh (Istio, Linkerd) adds gateway-like functionality to internal service-to-service communication — mTLS, retries, circuit breakers, observability. It's a sidecar proxy next to every service, not a central gateway. Use this when you have 20+ microservices and need fine-grained traffic control internally.

Our recommendation: Start with a single edge gateway. Add BFF gateways when mobile and web teams diverge. Add a service mesh only when internal traffic management becomes painful (usually 20+ services). Most teams don't need Istio — it adds significant operational complexity. A simple HTTP client library with retries and circuit breakers covers 80% of what a mesh does.
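
As a rough idea of what "a simple HTTP client library with retries" involves, here is a minimal retry helper with exponential backoff and jitter. The function name `call_with_retries` and its parameters are assumptions made for this sketch; in practice you would retry only idempotent requests and only on errors that are likely to be transient.

```python
import random
import time


def call_with_retries(func, attempts=3, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Call func(); on failure, wait with exponential backoff and retry.

    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Backoff cap doubles each attempt: base, 2*base, 4*base, ...
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: a random wait in [0, cap] spreads out retry storms.
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable so the helper can be tested without real delays.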

3. Gateway Solutions Compared

| Solution | Type | Best For | Cost | Complexity |
|---|---|---|---|---|
| Kong | Open-source / Enterprise | Plugin ecosystem, Kubernetes-native | Free (OSS) / $35K+/yr | Medium |
| AWS API Gateway | Managed (serverless) | AWS-native, Lambda integration | $3.50/M requests | Low |
| Nginx + Lua | Self-hosted | High performance, full control | Free (ops cost) | High |
| Traefik | Open-source | Docker/K8s auto-discovery, Let's Encrypt | Free (OSS) | Low-Medium |
| Envoy | Open-source (CNCF) | Service mesh sidecar, gRPC-native | Free (ops cost) | High |
| Express/Fastify (custom) | Build your own | BFF pattern, custom aggregation | Dev time | Medium |

4. Routing and Request Transformation

The gateway maps public API paths to internal service URLs. This decouples your public API contract from your internal service structure — you can split, merge, or rename services without changing the public API.

# Kong declarative config (kong.yml)
_format_version: "3.0"

services:
  - name: user-service
    url: http://user-svc.internal:3001
    routes:
      - name: users-api
        paths: ["/api/v1/users"]
        strip_path: true
        methods: [GET, POST, PUT, DELETE]

  - name: order-service
    url: http://order-svc.internal:3002
    routes:
      - name: orders-api
        paths: ["/api/v1/orders"]
        strip_path: true

  - name: product-service
    url: http://product-svc.internal:3003
    routes:
      - name: products-api
        paths: ["/api/v1/products"]
        strip_path: true

# Client calls: api.example.com/api/v1/orders/123
# Gateway routes to: order-svc.internal:3002/123
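
The path-to-upstream mapping in the config above can be modelled in a few lines. This is a simplified sketch, not Kong's actual router: the `ROUTES` table and `resolve_upstream` are names invented for illustration, and matching here is on whole path segments with the longest prefix winning.

```python
from typing import Optional

# Public path prefix -> internal service URL (mirrors the config above).
ROUTES = {
    "/api/v1/users": "http://user-svc.internal:3001",
    "/api/v1/orders": "http://order-svc.internal:3002",
    "/api/v1/products": "http://product-svc.internal:3003",
}


def resolve_upstream(path: str, strip_path: bool = True) -> Optional[str]:
    """Return the upstream URL for a public path, or None if no route matches."""
    # Try the longest prefix first so a more specific route wins.
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            # strip_path removes the matched prefix before forwarding.
            remainder = path[len(prefix):] if strip_path else path
            return ROUTES[prefix] + (remainder or "/")
    return None


if __name__ == "__main__":
    print(resolve_upstream("/api/v1/orders/123"))
```

Because clients only ever see the public prefix, the internal URL on the right-hand side can change without a client release.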

Request/Response Transformation

Gateways can modify requests and responses in flight. Common transformations:

5. Authentication at the Gateway

Gateway authentication validates tokens once and forwards user context to downstream services. This eliminates redundant validation and centralizes your auth logic.

Auth flow through gateway:

Client ──▶ Gateway ──▶ Downstream Service

1. Client sends: Authorization: Bearer eyJhbG...
2. Gateway validates JWT (signature, expiry, issuer)
3. Gateway extracts claims: { userId: "u_123", role: "admin" }
4. Gateway forwards:
   X-User-ID: u_123
   X-User-Role: admin
   X-Request-ID: req_abc123
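5. Downstream trusts these headers (internal network only), so the gateway must drop any X-User-* headers supplied by the client
6. Downstream skips JWT validation entirely

# Kong JWT plugin configuration
plugins:
  - name: jwt
    config:
      key_claim_name: iss
      claims_to_verify: [exp]
      header_names: [Authorization]

  # Illustrative only: the documented request-transformer template variables
  # are headers, query_params, uri_captures, and shared. Decoded claims are
  # not exposed as jwt.payload.*, so forwarding them typically needs a custom
  # plugin or a pre-function that sets the values first.
  - name: request-transformer
    config:
      add:
        headers:
          - "X-User-ID:$(jwt.payload.sub)"
          - "X-User-Role:$(jwt.payload.role)"

# Public endpoints (no auth required)
# Route fragment. Verify on your Kong version that a disabled route-level
# instance overrides a globally enabled plugin; scoping jwt to the protected
# services or routes avoids relying on that behavior.
  - name: public-routes
    paths: ["/api/v1/health", "/api/v1/products"]
    plugins:
      - name: jwt
        enabled: false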

| Auth Strategy | When to Use | Gateway Handles |
|---|---|---|
| JWT validation | Stateless, most common | Signature check, expiry, forwarding claims |
| API key | Machine-to-machine, B2B | Key lookup, rate limit per key |
| OAuth2 / OIDC | Social login, SSO | Token exchange, redirect flows |
| mTLS | Service-to-service | Certificate validation, service identity |
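
The six-step auth flow in this section can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not how Kong implements it: it handles only HS256 tokens with a shared secret, and `sign_hs256` and `authenticate` are hypothetical helper names. A production gateway would use a maintained JWT library and usually asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Demo helper: mint an HS256 token so the example is self-contained."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def authenticate(headers: dict, secret: bytes, issuer: str, now=None) -> dict:
    """Validate the bearer token; return the headers to forward downstream.

    Raises ValueError if the token is missing, malformed, or invalid.
    """
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        raise ValueError("missing bearer token")
    try:
        header_b64, payload_b64, sig_b64 = auth[len("Bearer "):].split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:  # covers bad base64, bad JSON, wrong part count
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # step 2: signature
        raise ValueError("bad signature")
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:  # step 2: expiry
        raise ValueError("token expired")
    if claims.get("iss") != issuer:  # step 2: issuer
        raise ValueError("wrong issuer")
    if "sub" not in claims:
        raise ValueError("missing subject")
    # Drop identity headers the client may have spoofed, then set trusted ones.
    forwarded = {k: v for k, v in headers.items()
                 if k.lower() not in ("x-user-id", "x-user-role")}
    forwarded["X-User-ID"] = str(claims["sub"])
    forwarded["X-User-Role"] = str(claims.get("role", ""))
    return forwarded
```

Stripping client-supplied `X-User-*` headers before setting them is what makes step 5 safe: downstream services can trust those headers only if nothing outside the gateway can set them.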

6. Rate Limiting and Throttling

Rate limiting protects your services from abuse and ensures fair usage. The gateway is the natural place for it — one central enforcement point instead of every service implementing its own.

| Algorithm | How It Works | Best For |
|---|---|---|
| Fixed Window | 100 requests per minute, counter resets at :00 | Simple, but allows bursts at window boundaries |
| Sliding Window | Weighted average of current and previous windows | Smooth limiting, no burst at boundaries |
| Token Bucket | Bucket fills at fixed rate; each request takes a token | Allows controlled bursts (bucket can hold N tokens) |
| Leaky Bucket | Requests queue up and drain at a fixed rate | Smooth output rate, good for downstream protection |

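
To make the "weighted average of current and previous windows" concrete, here is a minimal single-process sketch of a sliding window counter. The class name `SlidingWindowLimiter` is an assumption for this example; a gateway cluster would keep these counters in a shared store such as Redis rather than in memory.

```python
class SlidingWindowLimiter:
    """Sliding window counter: estimates the request rate from two fixed windows."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.current_index = -1   # index of the fixed window being counted
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now: float) -> bool:
        index = int(now // self.window)
        if index != self.current_index:
            # Roll over. The old count only matters if it belongs to the
            # window immediately before the new one.
            adjacent = index == self.current_index + 1
            self.previous_count = self.current_count if adjacent else 0
            self.current_count = 0
            self.current_index = index
        # Fraction of the current window that has already elapsed.
        elapsed = (now % self.window) / self.window
        # Weight the previous window by how much of it still overlaps the
        # sliding window that ends at `now`.
        estimated = self.previous_count * (1.0 - elapsed) + self.current_count
        if estimated >= self.limit:
            return False
        self.current_count += 1
        return True
```

Unlike a fixed window, a client that exhausts its limit late in one window cannot immediately send a full second burst when the next window starts, because the previous count still carries weight.
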
# Kong rate limiting plugin
plugins:
  - name: rate-limiting
    config:
      minute: 60           # 60 requests per minute
      hour: 1000           # 1000 per hour
      policy: redis        # shared counter across gateway instances
      redis_host: redis.internal
      limit_by: credential # per API key (or: ip, consumer, header)
      hide_client_headers: false

# Response headers clients receive:
# X-RateLimit-Limit-Minute: 60
# X-RateLimit-Remaining-Minute: 42
# Retry-After: 18 (when limit exceeded)
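
For comparison with the windowed counters Kong uses above, here is a minimal token bucket sketch. The class name `TokenBucket` and its parameters are assumptions for this example; time is passed in explicitly so the behavior is easy to test.

```python
class TokenBucket:
    """Token bucket: refills at a fixed rate, each request spends one token."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity        # start full, so an initial burst is allowed
        self.last_refill = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = max(self.last_refill, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=5` and `refill_per_second=1.0`, a client can burst five requests at once and then sustain one request per second. The capacity sets the burst size and the refill rate sets the long-term average.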

7. Circuit Breakers and Resilience

When a downstream service is failing, the gateway should fail fast instead of waiting for timeouts. A circuit breaker tracks failure rates and "opens" when too many requests fail — subsequent requests return immediately with an error instead of queuing up and exhausting connections.

Circuit Breaker States:

┌──────────┐  failures > threshold   ┌──────────┐
│  CLOSED  │ ───────────────────────▶│   OPEN   │
│ (normal) │                         │ (reject) │
└──────────┘                         └────┬─────┘
     ▲                                    │
     │ success              after timeout │
     │                                    ▼
     │                           ┌──────────────┐
     └───────────────────────────│  HALF-OPEN   │
                                 │ (test 1 req) │
                                 └──────────────┘
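
The three states in the diagram can be sketched as a small single-threaded class. This is an illustration under assumptions, not the code of any particular gateway: `CircuitBreaker`, `CircuitOpenError`, and the parameter names are invented for this example, failures are counted consecutively, and a failed trial request in the half-open state reopens the circuit.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the backend."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock            # injectable so tests can control time
        self.state = "closed"
        self.failures = 0             # consecutive failures
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: let one trial request through.
            self.state = "half-open"
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed trial reopens immediately; otherwise open once the
        # consecutive failure count exceeds the threshold.
        if self.state == "half-open" or self.failures > self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _on_success(self):
        self.failures = 0
        self.state = "closed"
```

While the circuit is open, callers get `CircuitOpenError` immediately instead of holding a connection until the backend times out.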

Resilience patterns we implement at the gateway layer:

8. Implementation: Kong + Nginx

Here's the production gateway stack we deploy for most clients: Kong (built on Nginx/OpenResty) running in Docker with PostgreSQL for configuration storage and Redis for rate limiting state.

# docker-compose.yml: Kong API Gateway
# Before the first start, run `kong migrations bootstrap` once against kong-db.
version: '3.8'
services:
  kong-db:
    image: postgres:15
    environment:
      POSTGRES_DB: kong
      POSTGRES_USER: kong
      POSTGRES_PASSWORD: ${KONG_DB_PASSWORD}
    volumes:
      - kong_data:/var/lib/postgresql/data

  kong:
    image: kong:3.6
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-db
      KONG_PG_PASSWORD: ${KONG_DB_PASSWORD}
      KONG_PROXY_LISTEN: "0.0.0.0:8000, 0.0.0.0:8443 ssl"
      KONG_ADMIN_LISTEN: "0.0.0.0:8001"
      KONG_LOG_LEVEL: info
    ports:
      - "80:8000"
      - "443:8443"
    depends_on: [kong-db, redis]

  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru

# The named volume used by kong-db must be declared at the top level
volumes:
  kong_data:

Our standard plugin stack: JWT auth + rate-limiting (Redis-backed) + request-transformer (add headers) + cors + prometheus (metrics) + file-log (structured JSON). This covers 95% of gateway needs without custom code. The remaining 5% we handle with custom Lua plugins or a thin BFF service.

9. Frequently Asked Questions

Isn't an API gateway a single point of failure?

Yes, if you run a single instance. In production, run at least two gateway instances behind a cloud load balancer (ALB, NLB) with health checks. Kong supports this natively: multiple instances share the same configuration database. With plain Nginx, deploy the same configuration file to every instance. We typically run 3 gateway instances across availability zones for production workloads.

Should I use GraphQL as my API gateway?

GraphQL federation (Apollo Gateway) can work as a gateway layer: it composes schemas from multiple services into one graph. This works well when your frontend team wants flexible queries. But it adds complexity: schema composition, query planning, and performance concerns with deeply nested queries. We use it for B2C products where frontend flexibility matters. For B2B APIs, REST with an edge gateway is simpler and better documented.

How much latency does an API gateway add?

Typically 1-5ms for routing and authentication. Kong adds about 1-2ms per request for basic routing. JWT validation adds another 1-2ms. Rate limiting with Redis adds under 1ms. The total overhead is negligible compared to the 50-200ms your services spend on business logic and database queries. The latency savings from response caching at the gateway usually more than offset the added hop.
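
The trade-off between the added hop and cache savings can be estimated with simple arithmetic. The sketch below assumes a cache hit costs only the gateway overhead and a miss costs overhead plus backend time; the function name and the 30% hit ratio in the note are hypothetical inputs, not measurements.

```python
def expected_latency_ms(gateway_overhead_ms: float, backend_ms: float,
                        cache_hit_ratio: float) -> float:
    """Average request latency through a caching gateway.

    Every request pays the gateway overhead; only cache misses pay the
    backend cost.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return gateway_overhead_ms + (1.0 - cache_hit_ratio) * backend_ms
```

With 5 ms of overhead, a 100 ms backend, and an assumed 30% hit ratio, the estimate is about 75 ms compared with 100 ms when calling the service directly. Under this model the gateway breaks even when the hit ratio equals overhead divided by backend time, here 5%.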

Do I need an API gateway if I only have a monolith?

Probably not a full gateway, but a reverse proxy (Nginx, Caddy) gives you most benefits with less complexity: SSL termination, static file serving, gzip compression, rate limiting, and basic auth. When you split your monolith into services later, upgrading the reverse proxy to a full gateway (Kong, Traefik) is straightforward.

Pillai Infotech Engineering Team

We've deployed API gateways for clients ranging from early-stage startups (Nginx reverse proxy) to enterprise platforms (Kong clusters handling 50K+ RPS). Our approach: start with the simplest solution that works, add complexity when the metrics demand it.
