Jina AI · Rate Limits

Jina AI applies rate limits per API key across three tiers: Free, Paid, and Premium. Each tier caps requests per minute (RPM), tokens per minute (TPM), and concurrent in-flight requests. The caps apply uniformly across all Search Foundation services (Embeddings, Reranker, Reader, Classifier, Segmenter, DeepSearch). The tier is determined by the key's token balance and billing status.

9 limits · Throttle: HTTP 429 · Quota: HTTP 402
Tags: Rate Limiting, AI, Embeddings, LLM

Limits

Tier     Scope    RPM (requests/minute)  TPM (tokens/minute)  Concurrent requests
Free     api-key  100                    100,000              2
Paid     api-key  500                    2,000,000            50
Premium  api-key  5,000                  50,000,000           500
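To make the interplay between the RPM and TPM caps concrete, here is a minimal Python sketch that stores the limits listed above and estimates how many one-minute windows a workload needs. The names `TierLimits`, `TIERS`, and `min_minutes` are hypothetical, and the estimate assumes the caps are counted in fixed one-minute windows; it ignores request latency and the concurrency cap, so treat the result as a lower bound rather than a prediction.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    rpm: int          # requests per minute
    tpm: int          # tokens per minute
    concurrency: int  # concurrent in-flight requests


# Values copied from the limits listed above.
TIERS = {
    "free": TierLimits(rpm=100, tpm=100_000, concurrency=2),
    "paid": TierLimits(rpm=500, tpm=2_000_000, concurrency=50),
    "premium": TierLimits(rpm=5_000, tpm=50_000_000, concurrency=500),
}


def min_minutes(tier: str, requests: int, tokens_per_request: int) -> int:
    """Lower bound on the one-minute windows needed to send the workload.

    Assumes fixed one-minute windows (an assumption, not documented behavior)
    and ignores latency and the concurrency cap.
    """
    limits = TIERS[tier]
    by_rpm = math.ceil(requests / limits.rpm)
    by_tpm = math.ceil(requests * tokens_per_request / limits.tpm)
    # Whichever cap is hit first determines the pace.
    return max(by_rpm, by_tpm)


print(min_minutes("free", 1000, 2000))  # 20: the TPM cap binds
print(min_minutes("paid", 1000, 2000))  # 2: the RPM cap binds
```

On the Free tier the same 1,000 requests are bound by tokens (2,000,000 tokens against 100,000 per minute), while on the Paid tier they are bound by request count, which is why both caps need to be checked.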

Policies

Tier upgrade
The tier (Free, Paid, or Premium) is determined by the token balance and billing status of the API key. Topping up tokens promotes the key to a higher tier with higher RPM, TPM, and concurrency limits.
Backoff
Honor the Retry-After header on 429 responses and back off exponentially with jitter.
Token exhaustion
When the token balance reaches zero, the API returns 402 Payment Required; top up via Stripe in the API Dashboard.
Shared across services
Rate limits and concurrency caps apply across all Jina services on the same key, not per endpoint.
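The backoff and token-exhaustion policies above can be sketched in Python as follows. This is an illustration under stated assumptions, not Jina's client: `call_with_backoff`, `retry_delay`, and `QuotaExhausted` are hypothetical names, and `send` is a caller-supplied function that performs one HTTP request and returns `(status, headers, body)`. The sketch assumes Retry-After carries a number of seconds; if it is absent or in HTTP-date form, it falls back to exponential backoff with full jitter.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


class QuotaExhausted(Exception):
    """HTTP 402: the token balance is empty, so retrying cannot help."""


def retry_delay(attempt: int, retry_after: Optional[str],
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying; `attempt` is 0-based."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is given in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Not a number (e.g. an HTTP date); use backoff instead.
    # Full jitter: uniform in [0, min(cap, base * 2^attempt)].
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_backoff(send: Callable[[], Response],
                      max_retries: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send`, retrying on 429 and failing fast on 402."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status == 402:
            raise QuotaExhausted("token balance exhausted; top up the key")
        if status != 429:
            return status, headers, body
        if attempt < max_retries:
            # Assumes `headers` uses the canonical "Retry-After" spelling.
            sleep(retry_delay(attempt, headers.get("Retry-After")))
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

Because the caps are shared across services on the same key, a wrapper like this is best applied to every call made with that key rather than to a single endpoint. The `sleep` parameter exists only so the waiting can be replaced in tests.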
