Rate Limiting
Overview
AppKernel ships with built-in rate limiting to protect against brute-force login attempts, credential stuffing, and API enumeration. The throttle runs as a Starlette middleware that sits in front of the security middleware, stopping excessive traffic before JWT validation is even attempted.
The implementation uses a fixed-window counter per client IP and endpoint group. All state is held in-process — no external dependency is required. See Multi-Instance Deployments if you run more than one instance behind a load balancer.
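To make the fixed-window idea concrete, here is a minimal sketch of a per-key counter. It is not AppKernel's actual limiter: the class name `FixedWindowCounter`, the injectable `clock`, and the choice to start a window at a client's first request are all assumptions made for illustration.

```python
import math
import time
from typing import Callable, Dict, Tuple


class FixedWindowCounter:
    """Illustrative fixed-window limiter (hypothetical, not AppKernel's class)."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # (client_ip, endpoint_group) -> (window_start, request_count)
        self._buckets: Dict[Tuple[str, str], Tuple[float, int]] = {}

    def check(self, client_ip: str, group: str = "*") -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one incoming request."""
        now = self._clock()
        key = (client_ip, group)
        start, count = self._buckets.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The window has expired: reset the counter instead of sliding.
            start, count = now, 0
        if count >= self.limit:
            remaining = self.window_seconds - (now - start)
            return False, max(1, math.ceil(remaining))
        self._buckets[key] = (start, count + 1)
        return True, 0
```

Because all state lives in the `_buckets` dictionary, a process restart clears every counter, and a second process would keep its own independent counts.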
Quick Start
Enable rate limiting after enabling security:
from appkernel import AppKernelEngine
kernel = AppKernelEngine('my-app', cfg_dir='./config')
kernel.enable_security() # add JWT/RBAC middleware
kernel.enable_rate_limiting() # add rate-limit middleware (runs first)
kernel.register(User, methods=['GET', 'POST', 'PUT', 'DELETE'])  # User: a model class defined elsewhere in your app
kernel.run()
With the defaults, each client IP is allowed 100 requests per 60-second window across the entire API surface. Requests that exceed the limit receive:
HTTP 429 Too Many Requests
Retry-After: 43
{
  "_type": "ErrorMessage",
  "code": 429,
  "message": "Too many requests. Please slow down and retry after the indicated delay."
}
The Retry-After value is the number of seconds remaining in the current
window.
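On the client side, the Retry-After value can drive a simple back-off. The helper below is a sketch only: `retry_delay` and its `default` fallback are hypothetical names, and it assumes the header carries a number of seconds, as described above.

```python
from typing import Mapping, Optional


def retry_delay(status_code: int, headers: Mapping[str, str],
                default: float = 1.0) -> Optional[float]:
    """Return how many seconds to wait before retrying, or None if no wait is needed."""
    if status_code != 429:
        return None
    raw = headers.get("Retry-After")
    try:
        # The header is expected to hold the seconds left in the current window.
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing or not numeric: fall back to a fixed delay.
        return default
```

A caller would sleep for the returned number of seconds before re-sending the request, rather than retrying immediately and consuming the next window's budget.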
Important
Always call enable_rate_limiting() after enable_security().
Starlette applies middlewares in reverse registration order (last added =
outermost = first to execute). Adding rate limiting last ensures it runs
before authentication, so brute-force attempts are stopped without incurring
the cost of JWT validation.
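The ordering rule can be illustrated without Starlette itself. The sketch below simulates the "last added is outermost" wrapping with plain functions; `make_middleware`, `build_stack`, and the middleware names are hypothetical and only model the behaviour described above.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def make_middleware(name: str) -> Callable[[Handler], Handler]:
    """Build a wrapper that records its name, then calls the wrapped handler."""
    def wrap(inner: Handler) -> Handler:
        def handler(trace: List[str]) -> None:
            trace.append(name)
            inner(trace)
        return handler
    return wrap


def build_stack(registered: List[str], endpoint: Handler) -> Handler:
    """Wrap the endpoint so that the last registered middleware ends up outermost."""
    app = endpoint
    for name in registered:  # the first registered wraps the endpoint directly
        app = make_middleware(name)(app)
    return app


def endpoint(trace: List[str]) -> None:
    trace.append("endpoint")


trace: List[str] = []
build_stack(["security", "rate_limit"], endpoint)(trace)
print(trace)  # ['rate_limit', 'security', 'endpoint']
```

Registering in the opposite order would put the security layer outermost, so every throttled request would still pay for token validation first.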
Configuration Reference
Pass a RateLimitConfig instance to customise behaviour:
from appkernel import AppKernelEngine, RateLimitConfig
kernel.enable_rate_limiting(
    RateLimitConfig(
        requests_per_window=100,    # global limit per client IP
        window_seconds=60,          # window length in seconds
        endpoint_limits={},         # per-prefix overrides (see below)
        exclude_paths=[],           # paths that bypass limiting
        trust_proxy_headers=False,  # honour X-Forwarded-For
    )
)
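| Parameter | Default | Description |
|---|---|---|
| requests_per_window | 100 | Maximum requests a single IP may make within window_seconds. |
| window_seconds | 60 | Duration of the counting window, in seconds. The counter resets when the window expires; it does not slide continuously. |
| endpoint_limits | {} | Per-path-prefix overrides. First matching prefix wins. See Per-Endpoint Limits. |
| exclude_paths | [] | Path prefixes that bypass rate limiting entirely. See Excluding Paths. |
| trust_proxy_headers | False | When True, the client IP is read from the first address in the X-Forwarded-For header. See Proxy and Load Balancer Deployments. |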
Recommended profiles:
| Traffic | requests_per_window | window_seconds |
|---|---|---|
| Low / auth | 10–20 | 60 |
| Medium | 100 | 60 (default) |
| High | 500 | 60 |
Per-Endpoint Limits
Authentication endpoints typically need tighter limits than the general API.
Use endpoint_limits to override the global limit for specific path prefixes:
kernel.enable_rate_limiting(
    RateLimitConfig(
        requests_per_window=200,             # generous global limit
        endpoint_limits={
            '/auth': 10,                     # brute-force protection
            '/users/change_password': 5,     # password-reset protection
            '/admin': 20,                    # admin surface
        }
    )
)
The first matching prefix wins, so order matters for overlapping prefixes.
Requests whose path does not match any prefix fall back to
requests_per_window.
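The lookup rule can be sketched as a small function. This is an illustration under assumptions, not the library's code: `resolve_limit` is a hypothetical helper, and it assumes prefixes are tried in dictionary insertion order using a plain string-prefix test.

```python
from typing import Mapping


def resolve_limit(path: str, endpoint_limits: Mapping[str, int], default: int) -> int:
    """Return the limit of the first matching prefix, else the global default."""
    for prefix, limit in endpoint_limits.items():  # dicts keep insertion order
        if path.startswith(prefix):
            return limit
    return default


limits = {'/auth': 10, '/users/change_password': 5, '/admin': 20}
print(resolve_limit('/auth/login', limits, 200))  # 10
print(resolve_limit('/users/42', limits, 200))    # 200

# With overlapping prefixes, the more specific one must come first to be reachable.
specific_first = {'/admin/reports': 50, '/admin': 20}
broad_first = {'/admin': 20, '/admin/reports': 50}
print(resolve_limit('/admin/reports/q1', specific_first, 200))  # 50
print(resolve_limit('/admin/reports/q1', broad_first, 200))     # 20
```

Under these assumptions, a broad prefix listed before a narrower one shadows it completely, so the narrower override never takes effect.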
Excluding Paths
Health checks, readiness probes, and metrics endpoints should not be rate limited as they are called by infrastructure at high frequency:
kernel.enable_rate_limiting(
    RateLimitConfig(
        exclude_paths=['/health', '/ready', '/metrics']
    )
)
Prefix matching is used — '/health' excludes /health, /health/live,
and /healthz.
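A minimal sketch of the prefix test makes the matching behaviour easy to check. The helper name `is_excluded` is hypothetical, and the sketch assumes a plain string-prefix comparison with no awareness of path segments.

```python
from typing import Iterable


def is_excluded(path: str, exclude_paths: Iterable[str]) -> bool:
    """Return True when the path starts with any configured exclusion prefix."""
    return any(path.startswith(prefix) for prefix in exclude_paths)


excluded = ['/health', '/ready', '/metrics']
print(is_excluded('/health/live', excluded))  # True
print(is_excluded('/healthz', excluded))      # True, matched by the '/health' prefix
print(is_excluded('/users', excluded))        # False
```

Since the match is on raw prefixes, a short entry exempts every path that begins with it, so prefixes are worth choosing as narrowly as the infrastructure allows.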
Proxy and Load Balancer Deployments
When AppKernel runs behind a reverse proxy (nginx, AWS ALB, Cloudflare), the TCP peer address seen by the application is the proxy's IP, not the client's. Without further configuration, every request then shares a single rate-limit bucket, which makes the throttle ineffective.
Set trust_proxy_headers=True to read the real IP from the first address in
the X-Forwarded-For header:
kernel.enable_rate_limiting(
    RateLimitConfig(trust_proxy_headers=True)
)
Warning
Only enable trust_proxy_headers when AppKernel sits behind a proxy that
you control and that strips or overwrites X-Forwarded-For. If the header
can be set by end users, an attacker can forge any IP and trivially bypass
per-IP limits.
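The address selection described in this section can be sketched as follows. The function `client_ip` is a hypothetical helper, and it assumes headers arrive as a mapping with lower-cased names; the real middleware may read them differently.

```python
from typing import Mapping


def client_ip(headers: Mapping[str, str], peer_ip: str,
              trust_proxy_headers: bool = False) -> str:
    """Pick the address used as the rate-limit key."""
    if trust_proxy_headers:
        forwarded = headers.get("x-forwarded-for", "")
        # The header is a comma-separated chain; the first entry is the original client.
        first = forwarded.split(",")[0].strip()
        if first:
            return first
    # Header not trusted or absent: fall back to the TCP peer address.
    return peer_ip


headers = {"x-forwarded-for": "203.0.113.7, 10.0.0.2"}
print(client_ip(headers, "10.0.0.2", trust_proxy_headers=True))   # 203.0.113.7
print(client_ip(headers, "10.0.0.2", trust_proxy_headers=False))  # 10.0.0.2
```

The second call shows why the flag defaults to off: when the header is ignored, a forged value has no influence on which bucket a request lands in.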
Multi-Instance Deployments
The default limiter stores all counters in the memory of the running process. If you run multiple AppKernel instances behind a load balancer, each instance tracks its own counters independently — a client could hit every instance at the configured limit, effectively multiplying their allowed throughput by the number of instances.
For multi-instance deployments, replace the in-process limiter with a
Redis-backed implementation. The middleware accepts any object that implements
the same check(request) -> (allowed, retry_after) interface as
RateLimiter:
from appkernel.rate_limit import RateLimitConfig, RateLimitMiddleware

class RedisRateLimiter:
    def __init__(self, redis_client, cfg: RateLimitConfig):
        self._redis = redis_client
        self._cfg = cfg

    def check(self, request) -> tuple[bool, int]:
        # implement a fixed window in Redis using INCR + EXPIRE
        # (a true sliding window needs a different structure, such as a sorted set)
        ...

# redis_client: an already-connected Redis client created elsewhere
limiter = RedisRateLimiter(redis_client, RateLimitConfig(requests_per_window=100))
kernel.app.add_middleware(RateLimitMiddleware, limiter=limiter)
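As a starting point for the body of check, the INCR + EXPIRE pattern can be sketched as a standalone function. Everything here is an assumption for illustration: `fixed_window_check` and the key format are hypothetical, and `FakeRedis` is an in-memory stand-in that mimics only the `incr`, `expire`, and `ttl` calls so the sketch runs without a server. With redis-py, a `redis.Redis` client offers methods of the same names.

```python
import math
import time
from typing import Callable, Dict, Tuple


def fixed_window_check(client, key: str, limit: int,
                       window_seconds: int) -> Tuple[bool, int]:
    """Count one request against `key`; return (allowed, retry_after_seconds)."""
    count = client.incr(key)
    if count == 1:
        # First hit in this window: start the countdown.
        # INCR and EXPIRE are separate commands here, so a crash between them
        # would leave the key without an expiry; production code may want to
        # run both atomically (for example in a script or transaction).
        client.expire(key, window_seconds)
    if count > limit:
        ttl = client.ttl(key)
        return False, ttl if ttl > 0 else window_seconds
    return True, 0


class FakeRedis:
    """In-memory stand-in for the three Redis commands used above."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, int] = {}
        self._expires: Dict[str, float] = {}

    def _purge(self, key: str) -> None:
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._expires.pop(key, None)

    def incr(self, key: str) -> int:
        self._purge(key)
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]

    def expire(self, key: str, seconds: int) -> bool:
        if key not in self._data:
            return False
        self._expires[key] = self._clock() + seconds
        return True

    def ttl(self, key: str) -> int:
        self._purge(key)
        if key not in self._data:
            return -2  # key does not exist
        if key not in self._expires:
            return -1  # key has no expiry
        return math.ceil(self._expires[key] - self._clock())
```

Inside a limiter's check method, the key would typically be derived from the client address and endpoint group, and the limit and window taken from the configuration object. Because every instance increments the same Redis key, the configured limit then applies across the whole deployment rather than per process.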