Inflection Rate Limits
Inflection AI's Developer API exposes a usage dashboard at developers.inflection.ai/usage that shows account-level rate-limit consumption. Exact requests-per-minute (RPM) and tokens-per-minute (TPM) values per model and tier are not publicly documented and are pending reconciliation; in practice they are set per contract.
Throttle response: HTTP 429 (Too Many Requests)
Tags: AI, LLM, Personal AI, Pi, Foundation Models, Rate Limiting, Quotas, Throttling
Limits
Requests Per Minute (RPM)
Scope: account
Value: see provider documentation
Note: Per-model RPM; varies by contract. Pending reconciliation.
Tokens Per Minute (TPM)
Scope: account
Value: see provider documentation
Note: Per-model TPM; varies by contract. Pending reconciliation.
On-Premise Throughput
Scope: deployment
Value: bounded by purpose-built server capacity
Note: On-premise deployments are limited by the configured server hardware.
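Because the RPM and TPM ceilings above are contract-specific, a client can track its own usage against whatever values its contract states and hold back requests before the server throttles them. The following is a minimal sketch of that idea using a trailing 60-second window. The class name `MinuteBudget`, its parameters, and any numbers passed to it are hypothetical and are not part of Inflection's API. The server's own accounting remains authoritative.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteBudget:
    """Tracks requests and tokens sent in the trailing 60 seconds (client-side only)."""

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # rpm and tpm are caller-supplied; take them from your own contract.
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.tokens_in_window = 0

    def _expire(self, now: float) -> None:
        # Drop events that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits both budgets, else False."""
        now = self.clock()
        self._expire(now)
        over_rpm = len(self.events) >= self.rpm
        over_tpm = self.tokens_in_window + tokens > self.tpm
        if over_rpm or over_tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would construct one `MinuteBudget` per model with its contracted limits, call `try_acquire(estimated_tokens)` before each request, and wait briefly when it returns `False`. The clock is injectable so the behavior can be tested without real delays.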
Policies
Backoff Strategy
On an HTTP 429 response, clients should retry with exponential backoff plus jitter and honor the Retry-After header when it is present.
Tiered Limits
Higher limits are unlocked through an Enterprise contract.
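The backoff policy above can be sketched as follows. This is an illustration under stated assumptions, not Inflection's client code: the names `retry_delay` and `call_with_backoff`, the `send` callable returning a `(status, headers)` pair, and the default base, cap, and retry count are all hypothetical. It uses the "full jitter" variant of exponential backoff and handles only the delta-seconds form of Retry-After.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status_code, headers); hypothetical shape


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0,
                rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honor Retry-After given as a number of seconds.
            # The HTTP-date form is not parsed here and falls through to backoff.
            return max(0.0, float(retry_after))
        except ValueError:
            pass
    # Full jitter: a random delay below min(cap, base * 2**attempt).
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Response], max_retries: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or retries run out."""
    attempt = 0
    while True:
        status, headers = send()
        if status != 429 or attempt >= max_retries:
            return status, headers
        # Header lookup is case-sensitive for a plain dict; normalize if needed.
        sleep(retry_delay(attempt, headers.get("Retry-After")))
        attempt += 1
```

In use, `send` would wrap a single HTTP request made with whatever client library the application already uses. Capping the delay and the retry count keeps a throttled client from waiting indefinitely, and the jitter spreads out retries from clients that were throttled at the same moment.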