xAI · Rate Limits

xAI Rate Limits

xAI enforces per-team rate limits on its synchronous API, measured in requests per minute (RPM) and tokens per minute (TPM). The limits vary by model and account tier. Batch API requests do not count toward the synchronous limits. Per-model values have not yet been reconciled for this page; consult the xAI Console for the limits currently active for your team.

3 limits · Throttle response: HTTP 429

Tags: AI, LLM, Foundation Models, Grok, Generative AI, Rate Limiting, Quotas, Throttling

Limits

Requests Per Minute (RPM)
Scope: team
Unit: requests
Limit: see provider documentation
Notes: Per-model RPM; varies by tier and model. Pending reconciliation.

Tokens Per Minute (TPM)
Scope: team
Unit: tokens
Limit: see provider documentation
Notes: Per-model TPM; varies by tier and model. Pending reconciliation.

Batch API
Scope: team
Unit: requests
Limit: not counted against synchronous limits
Notes: Batch jobs run asynchronously and do not consume synchronous RPM/TPM.
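
One way to stay under per-minute request and token limits is to track recent usage on the client before sending. The sketch below is an illustration only: the class name MinuteWindowLimiter, the 60-second sliding window, and the rpm and tpm arguments are assumptions, and the numbers passed in must come from the limits shown for your team, since this page does not list them. The server's own accounting remains authoritative, and client-side token counts are estimates.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteWindowLimiter:
    """Hypothetical client-side guard for per-minute request and token budgets."""

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm  # placeholder value supplied by the caller
        self.tpm = tpm  # placeholder value supplied by the caller
        self.clock = clock
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.tokens_in_window = 0

    def _expire(self, now: float) -> None:
        # Drop entries that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits in both budgets."""
        now = self.clock()
        self._expire(now)
        over_requests = len(self.events) >= self.rpm
        over_tokens = self.tokens_in_window + tokens > self.tpm
        if over_requests or over_tokens:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would check try_acquire(estimated_tokens) before each synchronous request and wait briefly when it returns False. Batch jobs would bypass this guard, since they do not consume the synchronous budgets.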

Policies

Backoff Strategy

Clients should implement exponential backoff with jitter and honor any Retry-After header.
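
The policy above can be sketched as a small retry loop. This is a minimal illustration under stated assumptions, not an xAI client: send is a hypothetical callable that performs one request and returns a (status, headers, body) tuple, the "full jitter" variant and the 1-second base and 60-second cap are arbitrary choices, and only the numeric (seconds) form of Retry-After is parsed.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def parse_retry_after(headers: Mapping[str, str]) -> Optional[float]:
    """Return Retry-After in seconds, or None if absent or not numeric."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return None  # the HTTP-date form is not handled in this sketch
    return None


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: scale min(cap, base * 2**attempt) by a random factor."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        status, headers, _ = response
        if status != 429 or attempt == max_attempts - 1:
            return response
        # Prefer the server's hint; otherwise fall back to jittered backoff.
        hint = parse_retry_after(headers)
        sleep(hint if hint is not None else backoff_delay(attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep and rng as parameters keeps the loop easy to test without real delays. If the final attempt is still throttled, the 429 response is returned so the caller can decide what to do next.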

Tiered Limits

Higher usage tiers and enterprise agreements unlock higher per-minute limits.
