Rate Limiting

Overview

AppKernel ships with built-in rate limiting to protect against brute-force login attempts, credential stuffing, and API enumeration. The throttle runs as a Starlette middleware that sits in front of the security middleware, stopping excessive traffic before JWT validation is even attempted.

The implementation uses a fixed-window counter per client IP and endpoint group. All state is held in-process — no external dependency is required. See Multi-Instance Deployments if you run more than one instance behind a load balancer.
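
To make the fixed-window idea concrete, here is a minimal sketch of a per-key counter. It is not AppKernel's actual RateLimiter: the class name `FixedWindowCounter`, the injectable `clock` parameter, and the `(client_ip, group)` key layout are assumptions made for illustration.

```python
import math
import time
from typing import Callable, Dict, Tuple


class FixedWindowCounter:
    """Illustrative fixed-window counter; not AppKernel's own implementation."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # (client_ip, endpoint_group) -> (window_start, request_count)
        self._buckets: Dict[Tuple[str, str], Tuple[float, int]] = {}

    def check(self, client_ip: str, group: str) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one incoming request."""
        now = self._clock()
        key = (client_ip, group)
        start, count = self._buckets.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            # Over the limit: report the time left until the window resets.
            remaining = self.window_seconds - (now - start)
            self._buckets[key] = (start, count)
            return False, max(1, math.ceil(remaining))
        self._buckets[key] = (start, count + 1)
        return True, 0
```

With `limit=2` and `window_seconds=60`, a third request arriving 17 seconds into the window is rejected with a retry delay of 43 seconds, and the counter starts over once the window has expired.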

Quick Start

Enable rate limiting after registering security:

from appkernel import AppKernelEngine

kernel = AppKernelEngine('my-app', cfg_dir='./config')
kernel.enable_security()       # add JWT/RBAC middleware
kernel.enable_rate_limiting()  # add rate-limit middleware (runs first)
kernel.register(User, methods=['GET', 'POST', 'PUT', 'DELETE'])
kernel.run()

With the defaults, each client IP is allowed 100 requests per 60-second window across the entire API surface. Requests that exceed the limit receive:

HTTP 429 Too Many Requests
Retry-After: 43

{
    "_type": "ErrorMessage",
    "code": 429,
    "message": "Too many requests. Please slow down and retry after the indicated delay."
}

The Retry-After value is the number of seconds remaining in the current window.
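
On the client side, a well-behaved caller can wait for the indicated delay before retrying. The sketch below is one possible approach, not part of AppKernel: `call_with_retry` and its `send` callable (which returns a status code and a header mapping) are hypothetical names, and the header lookup assumes the exact key `Retry-After`.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status_code, headers)


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(), waiting out Retry-After whenever the server answers 429."""
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status != 429 or attempt == max_attempts:
            return status, headers
        # Retry-After is given in whole seconds; fall back to 1 if missing or malformed.
        try:
            delay = max(0, int(headers.get("Retry-After", "1")))
        except ValueError:
            delay = 1
        sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.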

Important

Always call enable_rate_limiting() after enable_security(). Starlette applies middlewares in reverse registration order (last added = outermost = first to execute). Adding rate limiting last ensures it runs before authentication, so brute-force attempts are stopped without incurring the cost of JWT validation.
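
The ordering rule can be illustrated with a small pure-Python simulation. This is not Starlette code; `make_middleware` and `build_stack` are hypothetical helpers that only mimic the "last added wraps everything before it" behaviour described above.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def make_middleware(name: str) -> Callable[[Handler], Handler]:
    """Return a wrapper that records its name before calling the inner handler."""
    def wrap(inner: Handler) -> Handler:
        def handler(trace: List[str]) -> None:
            trace.append(name)
            inner(trace)
        return handler
    return wrap


def build_stack(registered: List[str]) -> Handler:
    """Wrap the endpoint so that the last registered middleware is outermost."""
    def endpoint(trace: List[str]) -> None:
        trace.append("endpoint")

    app: Handler = endpoint
    # Wrapping in registration order leaves the last registered layer on the outside.
    for name in registered:
        app = make_middleware(name)(app)
    return app


trace: List[str] = []
build_stack(["security", "rate_limiting"])(trace)
print(trace)  # ['rate_limiting', 'security', 'endpoint']
```

Because rate limiting was registered last, it sees the request first and can reject it before the security layer does any work.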

Configuration Reference

Pass a RateLimitConfig instance to customise behaviour:

from appkernel import AppKernelEngine, RateLimitConfig

kernel.enable_rate_limiting(
    RateLimitConfig(
        requests_per_window=100,     # global limit per client IP
        window_seconds=60,           # window length in seconds
        endpoint_limits={},          # per-prefix overrides (see below)
        exclude_paths=[],            # paths that bypass limiting
        trust_proxy_headers=False,   # honour X-Forwarded-For
    )
)

| Parameter | Default | Description |
|---|---|---|
| requests_per_window | 100 | Maximum requests a single IP may make within window_seconds. Exceeded requests receive HTTP 429. |
| window_seconds | 60 | Duration of the counting window. The counter resets when the window expires — it does not slide continuously. |
| endpoint_limits | {} | Per-path-prefix overrides. First matching prefix wins. See Per-Endpoint Limits. |
| exclude_paths | [] | Path prefixes that bypass rate limiting entirely. See Excluding Paths. |
| trust_proxy_headers | False | When True, the client IP is read from X-Forwarded-For rather than the TCP peer address. Enable only behind a trusted reverse proxy. See Proxy and Load Balancer Deployments. |

Recommended profiles:

| Traffic | requests_per_window | window_seconds |
|---|---|---|
| Low / auth | 10–20 | 60 |
| Medium | 100 | 60 (default) |
| High | 500 | 60 |

Per-Endpoint Limits

Authentication endpoints typically need tighter limits than the general API. Use endpoint_limits to override the global limit for specific path prefixes:

kernel.enable_rate_limiting(
    RateLimitConfig(
        requests_per_window=200,          # generous global limit
        endpoint_limits={
            '/auth': 10,                  # brute-force protection
            '/users/change_password': 5,  # password-change protection
            '/admin': 20,                 # admin surface
        }
    )
)

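The first matching prefix wins, so order matters when prefixes overlap: list a specific prefix such as '/users/change_password' before any broader prefix that also matches it (a hypothetical '/users', for example). Requests whose path matches no prefix fall back to requests_per_window.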
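
The first-match rule can be sketched as a simple lookup. This is an illustration, not AppKernel's code: `resolve_limit` is a hypothetical helper, the '/users' prefix is invented for the example, and the sketch assumes prefixes are tried in dict insertion order.

```python
from typing import Dict


def resolve_limit(path: str, endpoint_limits: Dict[str, int], default: int) -> int:
    """Return the limit of the first prefix that matches path, else default."""
    # Dicts preserve insertion order, so prefixes are tried in the order written.
    for prefix, limit in endpoint_limits.items():
        if path.startswith(prefix):
            return limit
    return default


limits = {"/users/change_password": 5, "/users": 50}
print(resolve_limit("/users/change_password", limits, 200))  # 5
print(resolve_limit("/users/42", limits, 200))               # 50
print(resolve_limit("/orders", limits, 200))                 # 200

# Reversing the order lets the broader prefix shadow the specific one.
shadowed = {"/users": 50, "/users/change_password": 5}
print(resolve_limit("/users/change_password", shadowed, 200))  # 50
```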

Excluding Paths

Health checks, readiness probes, and metrics endpoints should not be rate limited, because infrastructure calls them at high frequency:

kernel.enable_rate_limiting(
    RateLimitConfig(
        exclude_paths=['/health', '/ready', '/metrics']
    )
)

Matching is by plain string prefix: '/health' excludes /health and /health/live, and also /healthz, because no path-segment boundary is required.
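
The behaviour described above can be sketched as follows. Both functions are hypothetical; `is_excluded` mirrors the documented prefix rule, while `is_excluded_by_segment` is a stricter variant shown only for contrast and is not an AppKernel option.

```python
from typing import Iterable


def is_excluded(path: str, exclude_paths: Iterable[str]) -> bool:
    """Plain string-prefix match, as the documentation describes."""
    return any(path.startswith(prefix) for prefix in exclude_paths)


def is_excluded_by_segment(path: str, exclude_paths: Iterable[str]) -> bool:
    """Hypothetical stricter variant: match only on whole path segments."""
    return any(
        path == prefix or path.startswith(prefix.rstrip("/") + "/")
        for prefix in exclude_paths
    )


excluded = ["/health", "/ready", "/metrics"]
print(is_excluded("/healthz", excluded))             # True
print(is_excluded_by_segment("/healthz", excluded))  # False
```

If an unrelated route happens to share a prefix with an excluded path, it bypasses limiting too, so it is worth checking the route table when choosing prefixes.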

Proxy and Load Balancer Deployments

When AppKernel runs behind a reverse proxy (nginx, AWS ALB, Cloudflare), the TCP peer address seen by the application is the proxy's IP, not the real client's. With the default settings, all requests would therefore share a single rate-limit bucket, making the throttle ineffective.

Set trust_proxy_headers=True to read the real IP from the first address in the X-Forwarded-For header:

kernel.enable_rate_limiting(
    RateLimitConfig(trust_proxy_headers=True)
)

Warning

Only enable trust_proxy_headers when AppKernel sits behind a proxy that you control and that strips or overwrites X-Forwarded-For. If the header can be set by end users, an attacker can forge any IP and trivially bypass per-IP limits.
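
As a rough illustration of the lookup, the sketch below picks the rate-limit key from either the peer address or the first X-Forwarded-For entry. `resolve_client_ip` is a hypothetical helper, not AppKernel's code, and it assumes header names are already lower-cased in the mapping it receives.

```python
from typing import Mapping


def resolve_client_ip(peer_ip: str, headers: Mapping[str, str],
                      trust_proxy_headers: bool = False) -> str:
    """Pick the address used as the rate-limit key."""
    if trust_proxy_headers:
        forwarded = headers.get("x-forwarded-for", "")
        # The header is a comma-separated list; the first entry is the original client.
        first = forwarded.split(",")[0].strip()
        if first:
            return first
    # Header not trusted, missing, or empty: fall back to the TCP peer address.
    return peer_ip


headers = {"x-forwarded-for": "203.0.113.7, 10.0.0.2"}
print(resolve_client_ip("10.0.0.2", headers, trust_proxy_headers=True))   # 203.0.113.7
print(resolve_client_ip("10.0.0.2", headers, trust_proxy_headers=False))  # 10.0.0.2
```

The second call shows why the flag defaults to False: an untrusted header is ignored entirely, so a forged value cannot change the bucket a request lands in.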

Multi-Instance Deployments

The default limiter stores all counters in the memory of the running process. If you run multiple AppKernel instances behind a load balancer, each instance tracks its own counters independently — a client could hit every instance at the configured limit, effectively multiplying their allowed throughput by the number of instances.

For multi-instance deployments, replace the in-process limiter with a Redis-backed implementation. The middleware accepts any object that implements the same check(request) -> (allowed, retry_after) interface as RateLimiter:

from appkernel.rate_limit import RateLimitConfig, RateLimitMiddleware

class RedisRateLimiter:
    def __init__(self, redis_client, cfg: RateLimitConfig):
        self._redis = redis_client
        self._cfg = cfg

    def check(self, request) -> tuple[bool, int]:
        # implement a fixed-window counter in Redis using INCR + EXPIRE
        ...

# redis_client is an already-connected Redis client, created elsewhere
limiter = RedisRateLimiter(redis_client, RateLimitConfig(requests_per_window=100))
kernel.app.add_middleware(RateLimitMiddleware, limiter=limiter)
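
One possible shape for the `check` body is a fixed-window counter built on INCR, EXPIRE, and TTL. The sketch below is an assumption-laden illustration rather than a reference implementation: the class name `RedisFixedWindowLimiter`, the `ratelimit:<ip>` key layout, and the use of `request.client.host` as the client address are all choices made here, and the client is assumed to be a synchronous one exposing `incr`, `expire`, and `ttl` as redis-py does.

```python
from typing import Tuple


class RedisFixedWindowLimiter:
    """Sketch of a shared fixed-window counter; the key layout is an assumption."""

    def __init__(self, redis_client, limit: int = 100, window_seconds: int = 60):
        self._redis = redis_client
        self._limit = limit
        self._window = window_seconds

    def check(self, request) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) using one counter key per client IP."""
        key = f"ratelimit:{request.client.host}"
        count = self._redis.incr(key)  # atomic; creates the key with value 1
        if count == 1:
            # The first hit in a window starts the expiry clock.
            self._redis.expire(key, self._window)
        if count <= self._limit:
            return True, 0
        ttl = self._redis.ttl(key)  # seconds left in the current window
        if ttl < 0:
            # Key has no expiry (e.g. a crash between INCR and EXPIRE): repair it.
            self._redis.expire(key, self._window)
            ttl = self._window
        return False, max(1, ttl)
```

Because every instance increments the same key, the limit applies across the whole deployment rather than per process. INCR followed by EXPIRE is not atomic as a pair, which is why the sketch repairs a missing expiry; a Lua script or a transaction would close that gap more rigorously.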