Bifrost Rate Limits
Bifrost is a self-hosted gateway, so the project itself imposes no rate limits: limits are whatever the operator configures through virtual keys, budgets, and per-key rate-limit policies. Effective ceilings are governed by (1) the host's compute and network capacity and (2) the upstream LLM provider's own rate limits, which Bifrost handles through its fallback and load-balancing logic.
3 Limits
Throttle: 429
AI Gateway, LLM, Open Source, Rate Limiting
Limits
Virtual Key Request Rate (scope: virtual-key)
operator-configured
Bifrost virtual keys carry rate-limit policy set by the deploying operator.
Virtual Key Budget (scope: virtual-key)
operator-configured
A per-key budget policy caps spend on upstream provider costs.
Upstream Provider Limits (scope: provider/model)
inherited from upstream provider (OpenAI, Anthropic, etc.)
Bifrost respects upstream provider rate limits and, when one is hit, falls back to another configured provider or model.
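To make the two operator-configured limits concrete, the sketch below models one virtual key with a token-bucket request rate and a spend cap. It is an illustration only, not Bifrost's implementation or configuration format: the class name `VirtualKeyPolicy`, its parameters, and the returned reason strings are all hypothetical, and the token-bucket algorithm is an assumption chosen for the example.

```python
import time


class VirtualKeyPolicy:
    """Hypothetical per-key policy: token-bucket request rate plus a spend cap."""

    def __init__(self, requests_per_minute, budget_usd, now=None):
        self.capacity = float(requests_per_minute)  # operator-chosen rate
        self.budget_usd = float(budget_usd)         # operator-chosen spend cap
        self.spent_usd = 0.0
        self.tokens = self.capacity                 # start with a full bucket
        self.last_refill = time.monotonic() if now is None else now

    def check(self, estimated_cost_usd, now=None):
        """Return "ok", "rate_limited", or "over_budget" for one request."""
        now = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, capped at the bucket size.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.capacity / 60.0)
        self.last_refill = now
        if self.spent_usd + estimated_cost_usd > self.budget_usd:
            return "over_budget"
        if self.tokens < 1.0:
            return "rate_limited"
        # Admit the request: consume one token and record the spend.
        self.tokens -= 1.0
        self.spent_usd += estimated_cost_usd
        return "ok"
```

The `now` argument exists only so the behaviour can be exercised deterministically; a real deployment would rely on the clock. Upstream provider limits are not modelled here because they are enforced by the provider, not by the key policy.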
Policies
Adaptive Load Balancing (Enterprise)
The Enterprise edition redistributes load across upstream providers based on observed latency and error rate.
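The source does not describe how the Enterprise edition computes its redistribution, so the following is only a sketch of one plausible scheme under stated assumptions: each provider is scored as its success rate divided by its average latency, and scores are normalized into routing weights. The function name `routing_weights` and the scoring formula are hypothetical.

```python
def routing_weights(stats):
    """Map {provider: (avg_latency_seconds, error_rate)} to normalized weights.

    Assumed scoring (not Bifrost's documented algorithm):
    score = (1 - error_rate) / latency, so fast, reliable providers get more load.
    """
    scores = {}
    for name, (latency, error_rate) in stats.items():
        success = max(0.0, 1.0 - error_rate)
        scores[name] = success / max(latency, 1e-6)  # guard against zero latency
    total = sum(scores.values())
    if total == 0:
        # Every provider is failing: spread load evenly instead of dividing by zero.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: score / total for name, score in scores.items()}
```

With equal error rates, a provider that answers twice as fast receives twice the share of traffic under this scoring.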
Automatic Fallback
On 429 or 5xx from an upstream provider, Bifrost retries against a configured fallback provider/model.
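The fallback rule above can be sketched as a loop over an ordered list of targets. This is a minimal illustration, not Bifrost's code: `call_with_fallback`, the `send` callable, and the target names in the usage note are hypothetical stand-ins for the gateway's real provider clients.

```python
from typing import Callable, Sequence, Tuple

Response = Tuple[int, str]  # (HTTP status, body)


def call_with_fallback(
    targets: Sequence[str],
    send: Callable[[str], Response],
) -> Tuple[str, Response]:
    """Try each target in order; move to the next on 429 or any 5xx status."""
    if not targets:
        raise ValueError("at least one target is required")
    last = None
    for target in targets:
        status, body = send(target)
        last = (target, (status, body))
        if status == 429 or 500 <= status <= 599:
            continue  # throttled or upstream error: try the next target
        return last   # success or a non-retryable client error
    return last       # every target failed; surface the last failure
```

For example, with `targets = ["primary", "backup"]` and a `send` that returns 429 for `"primary"` and 200 for `"backup"`, the call returns the `"backup"` response. Other 4xx statuses are returned as-is because retrying a malformed request elsewhere would not help.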
Semantic Caching
Cache hits avoid upstream calls entirely, reducing effective rate-limit pressure.
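The effect of a cache hit can be shown with a toy semantic cache. Everything here is an assumption for illustration: `embed` is a bag-of-words stand-in for a real embedding model, the 0.8 similarity threshold is arbitrary, and `SemanticCache` and `complete` are hypothetical names, not Bifrost APIs.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response)

    def get(self, prompt):
        """Return the stored response closest to the prompt, if similar enough."""
        query = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt, response):
        self.entries.append((embed(prompt), response))


def complete(prompt, cache, call_upstream):
    """Serve from the cache when possible; otherwise call upstream and store."""
    cached = cache.get(prompt)
    if cached is not None:
        return cached  # no upstream request, so no rate-limit quota is consumed
    response = call_upstream(prompt)
    cache.put(prompt, response)
    return response
```

Only cache misses reach `call_upstream`, which is why a higher hit rate translates directly into fewer requests counted against upstream limits.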
Operator-Owned Configuration
Because Bifrost is self-hosted, all rate-limit thresholds are owned by the operator, not the project.