Mistral AI · Rate Limits
Mistral AI's la Plateforme exposes a chat-completions API at api.mistral.ai/v1 with per-account, per-model rate limits, enforced as requests per second and tokens per minute. Specific per-tier numbers are not displayed on the public docs or pricing pages we sampled; they are surfaced in-product in the la Plateforme console and can be raised via support. A 429 response with a Retry-After header indicates throttling.
3 Limits
Throttle: 429
Rate Limiting · AI · Large Language Models
Limits

| Limit | Scope | Value |
| --- | --- | --- |
| Requests per second (per model, per workspace) | account | See la Plateforme console; not publicly published per tier |
| Tokens per minute (per model, per workspace) | account | See la Plateforme console; not publicly published per tier |
| Concurrent requests | account | See la Plateforme console |
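Since the actual per-tier numbers only appear in the console, a client typically enforces whatever requests-per-second figure its tier shows. A minimal client-side token-bucket sketch (the `rps` value is an assumption you'd read from your own console, not a published limit):

```python
import time


class TokenBucket:
    """Client-side requests-per-second cap.

    `rps` is whatever your tier shows in the la Plateforme console;
    `burst` is how many requests may fire back-to-back.
    """

    def __init__(self, rps: float, burst: int = 1):
        self.rate = rps
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        # Replenish tokens based on elapsed time, capped at burst capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Calling `try_acquire()` before each request lets you queue or shed load locally instead of burning a 429 against the server-side limit.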
Policies
Honor Retry-After
429 responses include Retry-After (seconds). Honor the value before retrying with exponential backoff and jitter.
Per-model scoping
Limits are enforced per-model — heavy use of one model does not throttle others unless the workspace-wide budget is hit.
Tier upgrades
Higher per-tier rate limits are unlocked by adding a payment method and incurring usage; explicit limit raises can be requested via support.
Reasoning models
Reasoning-effort settings can significantly increase output tokens; size tokens-per-minute budgets for the worst case.