Qwen Rate Limits

Alibaba Cloud Model Studio enforces per-model quotas on requests per minute (RPM), tokens per minute (TPM), and concurrent tasks. Limits are configurable per workspace and visible in the console; defaults vary by model and tier.

Throttled requests return HTTP 429 (Too Many Requests).
Tags: AI, LLM, Alibaba, Rate Limiting

Limits

Limit             Scope    Unit                 Value                  Notes
Default RPM       account  requests-per-minute  see workspace console  Per-model defaults; varies by Qwen variant.
Default TPM       account  tokens-per-minute    see workspace console
Concurrent Tasks  account  concurrent-tasks     see workspace console  Image/video generation has separate concurrency caps.
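
Because the actual RPM and TPM values live in the workspace console, a client may want to track its own usage against whatever numbers are configured there. The following is a minimal sketch of a sliding 60-second budget check. The class name MinuteBudget, its parameters, and the window semantics are assumptions made for illustration; they are not part of any Alibaba Cloud SDK, and the service may count its windows differently.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteBudget:
    """Hypothetical client-side tracker: sliding 60-second window
    over request count (RPM) and token count (TPM)."""

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm      # copy the value from the workspace console
        self.tpm = tpm      # copy the value from the workspace console
        self.clock = clock  # injectable so the logic can be tested
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.tokens_in_window = 0

    def _evict(self, now: float) -> None:
        # Drop requests that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits both budgets;
        return False (recording nothing) if either would be exceeded."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.rpm:
            return False
        if self.tokens_in_window + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would check try_acquire(estimated_tokens) before sending a request and wait or queue when it returns False. This only reduces the chance of a 429; the server-side quota remains authoritative.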

Policies

Backoff Strategy
On a 429 response, retry with exponential backoff and jitter; honor the Retry-After header when it is present.
Limit Increase
Submit a quota request through the Alibaba Cloud console to raise RPM or TPM.
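
The backoff policy above can be sketched as follows. This is an illustrative example, not an official client: the function names backoff_delay and call_with_retry, the (status, headers, body) response shape, and the base and cap values are all assumptions. It uses the "full jitter" variant of exponential backoff and only handles a Retry-After value given in seconds.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    if retry_after is not None:
        try:
            # Retry-After given as a number of seconds: honor it as-is.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch; fall through.
    # Full jitter: uniform in [0, min(cap, base * 2**attempt)].
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out.

    The last response is returned even if it is still a 429, so the
    caller can decide how to surface the failure.
    """
    for attempt in range(max_attempts):
        response = send()
        status, headers, _ = response
        if status != 429 or attempt == max_attempts - 1:
            return response
        # Assumes the header key is spelled exactly "Retry-After";
        # a real HTTP client usually offers case-insensitive lookup.
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise ValueError("max_attempts must be at least 1")
```

Here send would wrap whatever HTTP call the application makes to Model Studio. Passing sleep as a parameter keeps the retry loop testable without real waiting.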