xAI Rate Limits
xAI enforces per-team rate limits on the synchronous API, measured in requests per minute (RPM) and tokens per minute (TPM). The limits vary by model and account tier. Batch API usage does not count toward the synchronous limits. This page does not list specific per-model values; consult the xAI Console for the limits active on your team.
3 Limits
Throttle response: HTTP 429 (Too Many Requests)
Tags: AI, LLM, Foundation Models, Grok, Generative AI, Rate Limiting, Quotas, Throttling
Limits
Requests Per Minute (RPM)
Scope: team
Limit: see provider documentation
Per-model RPM, varies by tier and model. Pending reconciliation.

Tokens Per Minute (TPM)
Scope: team
Limit: see provider documentation
Per-model TPM, varies by tier and model. Pending reconciliation.

Batch API
Scope: team
Limit: not counted against synchronous limits
Batch jobs run asynchronously and do not consume synchronous RPM/TPM.
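
Because the synchronous limits above are expressed per minute, a client can track its own request and token counts and hold back before the service has to throttle it. The following is a minimal sketch, not xAI's accounting method: the class name `MinuteWindowLimiter`, the trailing 60-second window, and the `rpm`/`tpm` values are all assumptions for illustration. Real values come from the xAI Console, and the service may count usage differently.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteWindowLimiter:
    """Hypothetical client-side guard for per-minute request and token budgets.

    Assumes a trailing 60-second window; the provider's actual windowing
    scheme is not documented on this page.
    """

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm                # max requests per window (placeholder value)
        self.tpm = tpm                # max tokens per window (placeholder value)
        self.clock = clock            # injectable so the limiter can be tested
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.tokens_in_window = 0

    def _evict(self, now: float) -> None:
        # Drop events that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits both budgets."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.rpm:
            return False
        if self.tokens_in_window + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would check `try_acquire(estimated_tokens)` before each synchronous request and wait when it returns `False`. Batch jobs would bypass this guard, since they do not consume synchronous RPM/TPM.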
Policies
Backoff Strategy
Clients should implement exponential backoff with jitter and honor any Retry-After header.
Tiered Limits
Higher usage tiers and enterprise agreements unlock higher per-minute limits.
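
The backoff policy above can be sketched as follows. This is an illustrative sketch under stated assumptions, not an official client: `send` is a hypothetical callable returning `(status_code, headers, body)`, the `base`, `cap`, and `max_attempts` defaults are arbitrary, and only the numeric (delay-in-seconds) form of `Retry-After` is parsed.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def parse_retry_after(headers: Mapping[str, str]) -> Optional[float]:
    """Return the Retry-After delay in seconds, or None if absent or non-numeric."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    value = lowered.get("retry-after")
    if value is None:
        return None
    try:
        return max(0.0, float(value))
    except ValueError:
        return None  # the HTTP-date form is not handled in this sketch


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Exponential backoff with full jitter: rng() * min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        delay = parse_retry_after(headers)
        if delay is None:
            # No usable Retry-After header: fall back to jittered backoff.
            delay = backoff_delay(attempt)
        sleep(delay)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Preferring the server's `Retry-After` value over the computed delay keeps the client aligned with whatever window the service is enforcing, while the jitter spreads out retries from clients that were throttled at the same moment.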