Inflection Rate Limits

Inflection AI's Developer API provides a usage dashboard at developers.inflection.ai/usage that shows account-level rate-limit consumption. The exact requests-per-minute (RPM) and tokens-per-minute (TPM) values for each model and tier are not publicly documented and are pending reconciliation; in practice, they are set per contract.

Limits: 3 · Throttle status: HTTP 429
Tags: AI, LLM, Personal AI, Pi, Foundation Models, Rate Limiting, Quotas, Throttling

Limits

Requests Per Minute (RPM)
  Scope: account
  Unit: requests
  Value: see provider documentation
  Notes: Per-model RPM; varies by contract. Pending reconciliation.

Tokens Per Minute (TPM)
  Scope: account
  Unit: tokens
  Value: see provider documentation
  Notes: Per-model TPM; varies by contract. Pending reconciliation.

On-Premise Throughput
  Scope: deployment
  Unit: requests
  Value: bounded by purpose-built server capacity
  Notes: On-premise deployments are limited by the configured server hardware.
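
Because the RPM and TPM values are set per contract, a client may want to track its own consumption against whatever numbers its contract specifies. The following is a minimal sketch of a client-side trailing-60-second window, not a description of how Inflection enforces limits. The class name `MinuteWindowLimiter`, its parameters, and the idea of passing an estimated token count per request are all assumptions made for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteWindowLimiter:
    """Client-side tracker for requests and tokens over a trailing 60 seconds.

    rpm and tpm are placeholders: supply the values from your own contract.
    """

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.tokens_in_window = 0

    def _evict(self, now: float) -> None:
        # Drop events that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits both budgets."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.rpm:
            return False
        if self.tokens_in_window + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would check `try_acquire(estimated_tokens)` before sending and wait briefly when it returns False. The server's own accounting remains authoritative, so a 429 can still occur even when the local tracker reports headroom.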

Policies

Backoff Strategy
Clients should retry throttled (HTTP 429) requests using exponential backoff with jitter, and honor the Retry-After header when it is present.
Tiered Limits
Higher limits are unlocked through an Enterprise contract.
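
The backoff policy above can be sketched as follows. This is an illustrative outline under stated assumptions, not Inflection's client code: `send` is a hypothetical callable that performs one request and returns `(status_code, headers, body)`, the base delay and cap are arbitrary defaults, and only the delay-in-seconds form of Retry-After is parsed (the HTTP-date form is ignored and falls back to jittered backoff).

```python
import math
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape used by this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def parse_retry_after(headers: Mapping[str, str]) -> Optional[float]:
    """Return Retry-After as seconds, or None if absent or not a plain number."""
    for key, value in headers.items():
        if key.lower() == "retry-after":
            try:
                seconds = float(value.strip())
            except ValueError:
                return None  # HTTP-date form is not handled in this sketch
            if math.isfinite(seconds) and seconds >= 0:
                return seconds
            return None
    return None


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random fraction of min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _ = response
        if status != 429 or attempt == max_attempts - 1:
            return response
        # Prefer the server's hint; otherwise fall back to jittered backoff.
        delay = parse_retry_after(headers)
        if delay is None:
            delay = backoff_delay(attempt)
        sleep(delay)
    return response
```

Jitter spreads retries from many clients over time so they do not all hit the API again at the same instant. If every attempt is throttled, the last 429 response is returned so the caller can decide how to proceed.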

Sources