Portkey Rate Limits

Portkey is an LLM gateway. Its effective request limits are set mainly by the upstream provider being proxied (OpenAI, Anthropic, Bedrock, etc.). Portkey itself publishes plan-bound caps on recorded logs rather than per-second request throttling: 10k/month on Developer, 100k/month included on Production with paid overage up to 3M, and 10M+ on Enterprise. Enterprise customers can configure granular budget and rate limits per virtual key and workspace. Concrete numeric request-per-second ceilings are not published on the public docs site at the time of writing.

Limits: 5 · Throttle status: 429 · Quota status: 429
Tags: AI Gateways, Governance, Observability, Rate Limiting

Limits

Developer plan recorded logs
Scope: account
Limit: 10,000 requests per month
Logs beyond the cap are not recorded, but requests still pass through.

Production plan recorded logs (included)
Scope: account
Limit: 100,000 requests per month
$9 per additional 100k requests up to 3M; beyond that, contact sales.

Production plan upper ceiling
Scope: account
Limit: 3,000,000 requests per month
Beyond 3M requests, the account is referred to sales to discuss an Enterprise plan.

Enterprise per-virtual-key rate limit
Scope: virtual key
Limit: varies, configurable per contract
Enterprise admins can configure granular rate limits and budgets per virtual key, workspace, or service account.

Upstream provider throttling
Scope: upstream
Limit: varies, governed by the proxied LLM provider (OpenAI, Anthropic, Bedrock, etc.)
Portkey passes through 429 responses from upstream LLM providers along with their Retry-After signaling.
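
The Production overage figures above can be turned into a quick estimate. The following is a minimal sketch, not Portkey's billing logic: it assumes overage is charged in whole 100k blocks (a partial block rounds up), which the source does not confirm, and it ignores the base plan price. The function name `production_overage_usd` and the constants are hypothetical.

```python
# Figures taken from the limits listed above.
INCLUDED_LOGS = 100_000      # recorded logs included in the Production plan
OVERAGE_BLOCK = 100_000      # overage is priced per additional 100k requests
OVERAGE_PRICE_USD = 9        # $9 per additional block
PLAN_CEILING = 3_000_000     # beyond this, the plan requires contacting sales


def production_overage_usd(recorded_logs: int) -> int:
    """Estimate monthly overage in USD for a Production account.

    ASSUMPTION: partial blocks are rounded up to a full 100k block.
    """
    if recorded_logs < 0:
        raise ValueError("recorded_logs must be non-negative")
    if recorded_logs > PLAN_CEILING:
        raise ValueError("above the 3M ceiling: pricing requires contacting sales")
    extra = max(0, recorded_logs - INCLUDED_LOGS)
    blocks = -(-extra // OVERAGE_BLOCK)  # integer ceiling division
    return blocks * OVERAGE_PRICE_USD
```

Under these assumptions, 250,000 recorded logs in a month means 150,000 over the included amount, which rounds up to two blocks.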

Policies

Cap-and-drop logging
When the recorded-log quota is exceeded on the Developer or Production tier, requests still succeed, but log records beyond the cap are dropped.
Pass-through throttling
When the upstream LLM provider throttles, Portkey returns the upstream 429 with whatever Retry-After header the provider supplied.
Granular enterprise limits
Enterprise tenants can scope budget and rate limits per virtual key, workspace, or environment for chargeback and blast-radius control.
Backoff
Clients should retry 429 and 5xx responses using exponential backoff with jitter, honoring the Retry-After header when the upstream provider supplies one.
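
A client-side retry loop for this policy might look like the sketch below. It is illustrative only and does not use any Portkey SDK: `send` is a hypothetical caller-supplied function returning `(status, headers, body)`, and the `base`, `MAX_DELAY`, and `max_attempts` values are arbitrary assumptions. It uses the "full jitter" variant of exponential backoff and prefers a numeric Retry-After value when one is present.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)

MAX_DELAY = 30.0  # assumed upper bound on any single wait, in seconds


def parse_retry_after(headers: Mapping[str, str]) -> Optional[float]:
    """Return Retry-After as seconds, or None if absent or not a number.

    The HTTP-date form of Retry-After is not handled in this sketch, and
    the lookup is case-sensitive for a plain dict.
    """
    value = headers.get("Retry-After")
    if value is None:
        return None
    try:
        seconds = float(value)
    except ValueError:
        return None
    return seconds if seconds >= 0 else None


def backoff_delay(attempt: int, base: float = 0.5,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random wait in [0, min(MAX_DELAY, base * 2**attempt))."""
    return rng() * min(MAX_DELAY, base * (2 ** attempt))


def call_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(), retrying on 429 and 5xx until max_attempts is reached."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts - 1:
            return status, headers, body
        hinted = parse_retry_after(headers)
        # Prefer the server's hint; otherwise fall back to jittered backoff.
        delay = min(hinted, MAX_DELAY) if hinted is not None else backoff_delay(attempt)
        sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Injecting `sleep` and `rng` keeps the loop testable without real waits. The last response is returned as-is once attempts run out, so the caller still sees the final 429 or 5xx.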
