Inflection AI · Rate Limits

Inflection Rate Limits

Inflection AI's Developer API exposes a usage dashboard at developers.inflection.ai/usage that surfaces account-level rate-limit consumption. The exact RPM / TPM values per model and tier are not publicly documented and are pending reconciliation; in practice they are set per contract.

3 limits · Throttle status: HTTP 429

Tags: AI, LLM, Personal AI, Pi, Foundation Models, Rate Limiting, Quotas, Throttling

Limits

Requests Per Minute (RPM)
Scope: account
Unit: requests
Value: see provider documentation
Per-model RPM, varies by contract. Pending reconciliation.

Tokens Per Minute (TPM)
Scope: account
Unit: tokens
Value: see provider documentation
Per-model TPM, varies by contract. Pending reconciliation.

On-Premise Throughput
Scope: deployment
Unit: requests
Value: bounded by purpose-built server capacity
On-premise deployments are limited by the configured server hardware.

Policies

Backoff Strategy

Clients should implement exponential backoff with jitter and honor Retry-After.
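
The backoff policy above can be sketched in Python roughly as follows. This is a minimal illustration and not Inflection's client code: the function names, the `send` callable (assumed to return a status code, a headers dict, and a body), and the `base`, `cap`, and `max_attempts` defaults are all hypothetical. It uses "full jitter" (a random delay between zero and an exponentially growing ceiling) and treats a Retry-After value, when present, as a minimum wait.

```python
import random
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the Retry-After delay in seconds, or None if absent or unparseable."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():  # delay-seconds form, e.g. "30"
        return float(value)
    try:  # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Full-jitter exponential backoff; Retry-After, if given, is a floor."""
    ceiling = min(cap, base * (2 ** attempt))
    delay = rng() * ceiling
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status_code, headers, body);
    headers is assumed to be a dict keyed by the exact name "Retry-After".
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        sleep(backoff_delay(attempt, parse_retry_after(headers.get("Retry-After"))))
    raise RuntimeError("rate limited: retries exhausted")
```

Capping the ceiling keeps worst-case waits bounded, and the jitter spreads retries out so that many clients throttled at the same moment do not all retry together. Real HTTP libraries often expose case-insensitive header mappings, in which case the exact-key assumption above can be dropped.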

Tiered Limits

Higher limits are unlocked through an Enterprise contract.
