Bifrost · Rate Limits

Bifrost is a self-hosted gateway, so the Bifrost project itself imposes no rate limits: limits are whatever the operator configures through virtual keys, budgets, and per-key rate-limit policies. In practice, effective ceilings are set by (1) the host's compute and network capacity and (2) each upstream LLM provider's own rate limits, which Bifrost handles through its fallback and load-balancing logic.

3 Limits · Throttle: 429

AI Gateway · LLM · Open Source · Rate Limiting

Limits

Virtual Key Request Rate

Scope: virtual-key
Metric: requests_per_minute
Limit: operator-configured

Each Bifrost virtual key carries a rate-limit policy set by the deploying operator.
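To make the per-key request rate concrete, here is a minimal sketch of a fixed-window requests-per-minute limiter keyed by virtual key. It is not Bifrost's implementation or configuration schema: the class name `PerKeyRateLimiter`, the helper `status_for`, the key names, and the choice of a fixed 60-second window are all assumptions made for illustration.

```python
import time


class PerKeyRateLimiter:
    """Fixed-window requests-per-minute limiter (illustrative, not Bifrost code)."""

    def __init__(self, requests_per_minute, clock=time.monotonic):
        self.requests_per_minute = requests_per_minute
        self.clock = clock  # injectable so the window can be tested without sleeping
        self._windows = {}  # virtual key -> (window_start, request_count)

    def allow(self, virtual_key):
        now = self.clock()
        start, count = self._windows.get(virtual_key, (now, 0))
        # Start a fresh window once 60 seconds have elapsed.
        if now - start >= 60.0:
            start, count = now, 0
        allowed = count < self.requests_per_minute
        if allowed:
            count += 1
        self._windows[virtual_key] = (start, count)
        return allowed


def status_for(limiter, virtual_key):
    """Map the limiter decision to an HTTP status: 429 signals throttling."""
    return 200 if limiter.allow(virtual_key) else 429
```

Each key gets its own counter, so one key exhausting its quota does not affect another. A token bucket or sliding window would smooth bursts at window boundaries; the fixed window is used here only because it is the shortest to read.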

Virtual Key Budget

Scope: virtual-key
Metric: spend_per_period
Limit: operator-configured

A per-key budget policy caps how much upstream provider cost a key can incur per period.
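A spend cap per period can be sketched as a small accumulator that resets when the period rolls over. This is a hypothetical illustration, not Bifrost's budget logic: `KeyBudget`, its field names, and the US-dollar unit are assumptions, and the sketch assumes the cost of a request is already known when it is charged, whereas in practice the cost is often only known after the upstream response.

```python
from dataclasses import dataclass


@dataclass
class KeyBudget:
    """Spend cap for one virtual key over a fixed period (illustrative only)."""

    limit_usd: float
    period_seconds: float
    period_start: float = 0.0
    spent_usd: float = 0.0

    def try_charge(self, cost_usd, now):
        # Reset the running total when a new period begins.
        if now - self.period_start >= self.period_seconds:
            self.period_start = now
            self.spent_usd = 0.0
        # Reject the charge if it would push spend past the cap.
        if self.spent_usd + cost_usd > self.limit_usd:
            return False
        self.spent_usd += cost_usd
        return True
```

For example, with `KeyBudget(limit_usd=1.0, period_seconds=3600)`, a 0.75 charge succeeds, a following 0.5 charge is refused, and the same 0.5 charge succeeds again once an hour has passed.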

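Upstream Provider Limits

Scope: provider/model
Metric: varies
Limit: inherited from upstream provider (OpenAI, Anthropic, etc.)

Requests routed through Bifrost remain subject to each upstream provider's rate limits; when a provider throttles, Bifrost can fall back to another configured provider.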

Policies

Adaptive Load Balancing (Enterprise)

The Enterprise edition redistributes load across upstream providers based on observed latency and error rate.
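The source does not describe how the Enterprise edition weighs latency against error rate, so the following is only a sketch of one plausible scheme: score each provider as its success rate divided by its average latency, then normalize the scores into traffic shares. The function name `routing_weights`, the scoring formula, and the provider names in the usage note are all assumptions.

```python
def routing_weights(stats):
    """Turn observed stats into traffic shares (hypothetical scoring, not Bifrost's).

    stats maps provider name -> (average latency in seconds, error rate in [0, 1]).
    """
    scores = {}
    for name, (latency, error_rate) in stats.items():
        success_rate = max(0.0, 1.0 - error_rate)
        # Guard against a zero latency reading to avoid division by zero.
        scores[name] = success_rate / max(latency, 1e-6)
    total = sum(scores.values())
    if total == 0:
        # Every provider is failing: fall back to an even split.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: score / total for name, score in scores.items()}
```

With `{"provider-a": (0.5, 0.0), "provider-b": (1.0, 0.0)}`, the faster provider receives two thirds of the traffic and the slower one the remaining third.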

Automatic Fallback

When an upstream provider returns a 429 or 5xx response, Bifrost retries the request against a configured fallback provider/model.
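The fallback rule can be sketched as a loop over an ordered list of targets that moves on only when a response is a 429 or a 5xx. This is an illustration of the rule as stated, not Bifrost's code: `call_with_fallback` and the convention that each target is a callable returning `(status_code, body)` are assumptions.

```python
def call_with_fallback(targets, request):
    """Try each target in order; fall through only on 429 or 5xx (illustrative).

    targets is an ordered list of callables, each returning (status_code, body).
    """
    if not targets:
        raise ValueError("at least one target is required")
    last_response = None
    for send in targets:
        status, body = send(request)
        if status == 429 or 500 <= status <= 599:
            # Throttled or failing upstream: remember it and try the next target.
            last_response = (status, body)
            continue
        # Any other status (success or a client error) is returned as-is.
        return status, body
    # Every target was throttled or failing: surface the last response seen.
    return last_response
```

Other 4xx responses are returned without trying a fallback, since a malformed request would most likely fail the same way against any provider.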

Semantic Caching

Cache hits avoid upstream calls entirely, reducing effective rate-limit pressure.
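As a rough illustration of why a semantic cache relieves rate-limit pressure, the sketch below returns a stored response whenever a new request's embedding is close enough to a previous one, so no upstream call is made. It is not Bifrost's cache: `SemanticCache`, the 0.95 cosine-similarity threshold, and the assumption that the caller supplies embedding vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Linear-scan similarity cache (illustrative; real systems use a vector index)."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self._entries = []  # list of (embedding, response)

    def get(self, embedding):
        best_response, best_score = None, self.threshold
        for stored, response in self._entries:
            score = cosine(stored, embedding)
            if score >= best_score:
                best_response, best_score = response, score
        return best_response  # None means a miss: the caller goes upstream

    def put(self, embedding, response):
        self._entries.append((list(embedding), response))
```

A hit returns the stored response directly, so the request never counts against the upstream provider's quota; only misses do.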

Operator-Owned Configuration

Because Bifrost is self-hosted, all rate-limit thresholds are set and owned by the operator, not the Bifrost project.
