Rate Limiting
ARX enforces per-organization rate limits to protect the platform from abuse and ensure fair resource allocation. Rate limiting is backed by Redis using a sliding window algorithm.
Default Limits
| Endpoint Category | Default Limit |
|---|---|
| API endpoints | 1,000 requests per minute |
| Auth endpoints (/v1/auth/*, /v1/sso/*, /v1/scim/*) | 100 requests per minute |
Auth endpoints have a lower limit to protect against credential stuffing and brute-force attacks.
How It Works
ARX uses a sliding window algorithm implemented with Redis sorted sets:
- Each request is timestamped and added to a sorted set keyed by organization (or client IP for unauthenticated requests).
- Entries older than the 60-second window are removed.
- The total count of entries in the window is compared against the limit.
- If the count exceeds the limit, the request is rejected with 429 Too Many Requests.
The sliding window approach avoids the burst problem of fixed windows, where a burst at the window boundary could allow double the intended rate.
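The four steps above can be sketched in a few lines. This is a minimal in-memory stand-in, not ARX's implementation: a sorted Python list plays the role of the Redis sorted set, and the class name `SlidingWindowLimiter` is hypothetical. It assumes that rejected requests are still recorded, since the steps add the entry before counting.

```python
import time
from bisect import bisect_right, insort
from typing import Dict, List, Optional

WINDOW_SECONDS = 60  # window length stated in the steps above


class SlidingWindowLimiter:
    """In-memory stand-in for the sorted-set logic (illustrative only)."""

    def __init__(self, limit: int, window: float = WINDOW_SECONDS) -> None:
        self.limit = limit
        self.window = window
        # key -> ascending list of request timestamps (stands in for a sorted set)
        self._entries: Dict[str, List[float]] = {}

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        entries = self._entries.setdefault(key, [])
        # Step 1: record this request's timestamp, keeping the list sorted.
        insort(entries, now)
        # Step 2: drop entries that have aged out of the window.
        cutoff = now - self.window
        del entries[: bisect_right(entries, cutoff)]
        # Steps 3-4: count what remains and compare it against the limit.
        return len(entries) <= self.limit
```

With a limit of 3, three requests in quick succession pass and a fourth is rejected; once the earliest entries are more than 60 seconds old, requests pass again. Against Redis, the same steps would typically map onto sorted-set commands such as ZADD, ZREMRANGEBYSCORE, and ZCARD, though the exact commands ARX issues are not stated here.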
Rate Limit Keys
| Context | Key Format |
|---|---|
| Authenticated (org-scoped) | arxsec:ratelimit:org:<org_id>:api or arxsec:ratelimit:org:<org_id>:auth |
| Unauthenticated (IP-scoped) | arxsec:ratelimit:ip:<client_ip>:api or arxsec:ratelimit:ip:<client_ip>:auth |
Authenticated requests are rate-limited by organization. All users in the same org share the org's rate limit pool. Unauthenticated requests are rate-limited by client IP using the X-Forwarded-For header (or the direct client address if the header is absent).
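As an illustration of the key formats in the table, the sketch below builds a key from the request context. The function names are hypothetical, and two details are assumptions rather than documented behavior: that the first entry of a comma-separated X-Forwarded-For value is treated as the client, and that header names arrive in exactly this casing.

```python
from typing import Mapping, Optional

KEY_PREFIX = "arxsec:ratelimit"


def client_ip(headers: Mapping[str, str], peer_addr: str) -> str:
    """Return the X-Forwarded-For client if present, else the direct peer address."""
    forwarded = headers.get("X-Forwarded-For", "").strip()
    if forwarded:
        # Assumption: the first comma-separated entry is the original client.
        return forwarded.split(",")[0].strip()
    return peer_addr


def rate_limit_key(
    category: str,
    org_id: Optional[str],
    headers: Mapping[str, str],
    peer_addr: str,
) -> str:
    """Build an org-scoped key when authenticated, an IP-scoped key otherwise."""
    if category not in ("api", "auth"):
        raise ValueError(f"unknown endpoint category: {category!r}")
    if org_id is not None:
        return f"{KEY_PREFIX}:org:{org_id}:{category}"
    return f"{KEY_PREFIX}:ip:{client_ip(headers, peer_addr)}:{category}"
```

Because the org-scoped key contains no user identifier, every user in the organization increments the same counter, which is what produces the shared pool described above.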
Custom Per-Organization Limits
Organizations can have custom rate limits stored in the orgs table:
| Column | Description |
|---|---|
| rate_limit_rpm | Custom API requests-per-minute limit |
| rate_limit_auth_rpm | Custom auth requests-per-minute limit |
If these columns are NULL, the default limits apply. Custom limits are cached in memory with a 5-minute TTL to minimize database lookups.
To set custom limits:
UPDATE orgs
SET rate_limit_rpm = 5000, rate_limit_auth_rpm = 500
WHERE id = '<org-id>';
The cache refreshes automatically; changes take effect within 5 minutes.
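The NULL fallback and the 5-minute cache could look roughly like the sketch below. `OrgLimitCache` and the `load_row` callback are hypothetical names: `load_row` stands in for whatever query reads rate_limit_rpm and rate_limit_auth_rpm from the orgs table, and the clock is injectable only to make the behavior easy to demonstrate.

```python
import time
from typing import Callable, Dict, Optional, Tuple

DEFAULT_API_RPM = 1000   # default API limit from the table above
DEFAULT_AUTH_RPM = 100   # default auth limit from the table above
CACHE_TTL_SECONDS = 300  # 5-minute TTL

Limits = Tuple[int, int]                    # (api_rpm, auth_rpm)
Row = Tuple[Optional[int], Optional[int]]   # raw column values, None for NULL


class OrgLimitCache:
    """Caches per-org limits in memory and falls back to defaults on NULL."""

    def __init__(
        self,
        load_row: Callable[[str], Row],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_row = load_row
        self._clock = clock
        # org_id -> (expiry time, resolved limits)
        self._cache: Dict[str, Tuple[float, Limits]] = {}

    def get(self, org_id: str) -> Limits:
        now = self._clock()
        hit = self._cache.get(org_id)
        if hit is not None and now < hit[0]:
            return hit[1]  # still fresh, skip the database
        api_rpm, auth_rpm = self._load_row(org_id)
        limits = (
            DEFAULT_API_RPM if api_rpm is None else api_rpm,
            DEFAULT_AUTH_RPM if auth_rpm is None else auth_rpm,
        )
        self._cache[org_id] = (now + CACHE_TTL_SECONDS, limits)
        return limits
```

Under this design an UPDATE is not visible until the cached entry expires, which is why a change can take up to 5 minutes to apply.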
Response Headers
Every response includes rate limit headers:
| Header | Description |
|---|---|
| X-RateLimit-Limit | The maximum number of requests allowed in the current window |
| X-RateLimit-Remaining | The number of requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the current window resets |
When a request is throttled, the response also includes:
| Header | Description |
|---|---|
| Retry-After | Number of seconds to wait before retrying |
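On the client side, these headers can be read into a small structure. The sketch below is one possible shape, assuming every value is an integer string and that header names are passed in exactly this casing; `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset, a Unix timestamp
    retry_after: Optional[int]  # Retry-After in seconds, only when throttled


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Parse the rate limit headers from a response's header mapping."""
    retry_after = headers.get("Retry-After")
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```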
Handling 429 Responses
When you receive a 429 Too Many Requests response:
{
"detail": "Rate limit exceeded. Please retry after the specified time."
}
Recommended client behavior:
- Read the Retry-After header to determine the wait time.
- Implement exponential backoff with jitter for repeated 429s.
- Avoid tight retry loops, which will extend the throttling period.
- For batch operations, spread requests evenly across the window rather than sending bursts.
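The first two recommendations can be combined in a small retry helper. This is a sketch under stated assumptions, not an official client: `send` is a hypothetical callable that performs one request and returns a (status, headers) pair, and the base delay, cap, and attempt count are illustrative values.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)


def backoff_delay(
    attempt: int,
    retry_after: Optional[float],
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    exponential = min(cap, base * (2 ** attempt))
    jittered = exponential * rng()  # "full jitter": uniform in [0, exponential)
    # Never retry sooner than the server asked via Retry-After.
    return max(jittered, retry_after or 0.0)


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        retry_after = headers.get("Retry-After")
        sleep(backoff_delay(attempt, float(retry_after) if retry_after else None))
    raise AssertionError("unreachable")
```

Taking the larger of the jittered delay and Retry-After keeps the client from retrying early, while the jitter keeps many clients in the same organization from retrying at the same instant.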
Bypassed Paths
The following paths are exempt from rate limiting:
| Path | Reason |
|---|---|
| /health | Health check endpoint for load balancers |
| /docs | API documentation |
| /redoc | Alternative API documentation |
| /openapi.json | OpenAPI schema |
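A bypass check for these paths might look like the following sketch. Whether ARX matches paths exactly or by prefix is not stated here; exact matching after dropping any query string is an assumption, and `is_rate_limited_path` is a hypothetical name.

```python
EXEMPT_PATHS = frozenset({"/health", "/docs", "/redoc", "/openapi.json"})


def is_rate_limited_path(path: str) -> bool:
    """Return False for the exempt paths listed above, True for everything else."""
    # Assumption: exact match on the path component, ignoring any query string.
    return path.split("?", 1)[0] not in EXEMPT_PATHS
```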
Fail-Open Behavior
If Redis is unavailable, the rate limiter fails open -- requests are allowed through without rate limiting. This ensures that a Redis outage does not cause a complete platform outage. A warning is logged (rate_limit.redis_unavailable) when this occurs.
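Failing open amounts to treating a backend error as "allow". The sketch below shows the pattern with hypothetical names: `check` stands in for the Redis-backed limit check, the logger name is invented, and the built-in ConnectionError and TimeoutError stand in for whatever exceptions the real Redis client raises.

```python
import logging
from typing import Callable

logger = logging.getLogger("arx.rate_limit")  # hypothetical logger name


def allow_request(check: Callable[[], bool]) -> bool:
    """Run the limit check, allowing the request if the backend is unreachable."""
    try:
        return check()
    except (ConnectionError, TimeoutError) as exc:
        # Fail open: log the documented warning event and let the request through.
        logger.warning("rate_limit.redis_unavailable: %s", exc)
        return True
```

The trade-off is that no limits are enforced for the duration of a Redis outage, so the warning event is worth alerting on.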
Redis Configuration
Rate limiting requires a Redis instance. Configure the connection via environment variable:
REDIS_URL=redis://localhost:6379
The Redis client uses a 2-second connection and socket timeout to avoid blocking request processing.
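For reference, a client with equivalent timeouts could be built with the redis-py package as sketched below. That ARX uses redis-py is an assumption, as are the function names and the fallback to redis://localhost:6379 when REDIS_URL is unset. The import is deferred so the settings helper works without the package installed.

```python
import os
from typing import Any, Dict, Mapping

REDIS_TIMEOUT_SECONDS = 2.0  # connection and socket timeout stated above


def redis_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect the connection URL and timeouts from the environment."""
    return {
        # Assumption: fall back to a local instance when REDIS_URL is unset.
        "url": env.get("REDIS_URL", "redis://localhost:6379"),
        "socket_connect_timeout": REDIS_TIMEOUT_SECONDS,
        "socket_timeout": REDIS_TIMEOUT_SECONDS,
    }


def make_client():
    """Create a redis-py client from the settings above."""
    import redis  # third-party redis-py package, imported lazily

    settings = redis_settings()
    return redis.Redis.from_url(settings.pop("url"), **settings)
```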