Google Gemini Rate Limits
Gemini API rate limits are scoped per usage tier (Free, Tier 1, Tier 2, Tier 3) and per model. Each tier defines RPM (requests per minute), TPM (tokens per minute), and RPD (requests per day) ceilings. Tier promotion is automatic based on cumulative spend and account age. Specific numerical limits are not statically published per model; they are visible in Google AI Studio per project. The Batch API has separate limits.
7 Limits
Throttle: 429
Quota: 403
Generative AI, LLM, Google, Rate Limiting
Limits
Free tier (scope: project)
Value: see AI Studio rate-limit page
Active project or free trial. Lowest RPM/TPM/RPD ceilings; varies by model.
Tier 1 (scope: project)
Value: 250 (USD monthly spend cap)
Linked billing account. $250 monthly spend cap; higher RPM/TPM/RPD than Free.
Tier 2 (scope: project)
Value: 2000 (USD monthly spend cap)
Reached after $100+ spent and 3 days on the account. $2,000 monthly spend cap.
Tier 3 (scope: project)
Value: 100000 (USD monthly spend cap)
Reached after $1,000+ spent and 30 days. Spend cap ranges $20,000-$100,000+ subject to review.
Batch API concurrent batch requests (scope: project)
Value: 100
Batch API input file size (scope: batch_request)
Value: 2147483648 bytes
2 GB.
Batch API enqueued tokens (scope: project/model)
Value: see model-specific batch quota
Ranges from millions to billions depending on model and tier.
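The two fixed Batch API limits listed above (100 concurrent batch requests per project and a 2147483648-byte input file) lend themselves to a local pre-flight check before submitting a job. The sketch below is illustrative only: `check_batch_submission` and its parameters are hypothetical names, not part of any Google SDK, and it assumes a file of exactly the maximum size is still accepted.

```python
MAX_BATCH_FILE_BYTES = 2_147_483_648  # 2 GB input file limit (equals 2**31 bytes)
MAX_CONCURRENT_BATCHES = 100          # concurrent batch requests per project


def check_batch_submission(file_size_bytes, active_batches):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: the caller supplies the input file size and the
    number of batch requests currently running in the project.
    """
    problems = []
    # Assumption: the size limit is inclusive (size == limit is allowed).
    if file_size_bytes > MAX_BATCH_FILE_BYTES:
        problems.append(
            f"input file is {file_size_bytes} bytes; limit is {MAX_BATCH_FILE_BYTES}"
        )
    # Submitting one more batch must not exceed the concurrency ceiling.
    if active_batches >= MAX_CONCURRENT_BATCHES:
        problems.append(
            f"{active_batches} batches already running; limit is {MAX_CONCURRENT_BATCHES}"
        )
    return problems
```

The file size could come from `os.path.getsize(path)`. The enqueued-token limit is not checked here because it is model-specific and has to be looked up per project.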
Policies
Tier promotion
Tier upgrades are automatic once the spend and account-age thresholds are met. Higher tiers grant higher RPM/TPM/RPD across all models in your project.
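The thresholds listed under Limits can be written as a small decision function, for example to estimate when a project will be promoted. This is a sketch under stated assumptions: `eligible_tier` is a hypothetical helper, "$100+" and "$1,000+" are read as "greater than or equal to", and the actual promotion decision is made by Google, not by client code.

```python
def eligible_tier(billing_linked, total_spend_usd, account_age_days):
    """Return the highest tier whose stated thresholds are met.

    Thresholds are copied from the Limits section; inclusive comparison
    is an assumption about how "$100+" and "$1,000+" are evaluated.
    """
    if not billing_linked:
        return "Free"
    if total_spend_usd >= 1000 and account_age_days >= 30:
        return "Tier 3"
    if total_spend_usd >= 100 and account_age_days >= 3:
        return "Tier 2"
    return "Tier 1"
```

For instance, a project with linked billing, $150 of spend, and 10 days on the account meets the Tier 2 thresholds but not the Tier 3 ones.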
Live rate-limit visibility
View your current per-model RPM/TPM/RPD on the Google AI Studio Rate Limits page. Exact values are not published as a static table because they vary by tier and by model.
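Because the ceilings have to be read from AI Studio rather than from a static table, one option is to copy them into a client-side limiter and throttle before the API does. The sketch below is a minimal sliding-window limiter for RPM and TPM only; `MinuteWindowLimiter` is a hypothetical class, the `rpm` and `tpm` values are whatever your project shows, and RPD would need a separate daily counter.

```python
import time
from collections import deque


class MinuteWindowLimiter:
    """Sliding 60-second window over request count and token count."""

    def __init__(self, rpm, tpm, clock=time.monotonic):
        self.rpm = rpm            # requests per minute, copied from AI Studio
        self.tpm = tpm            # tokens per minute, copied from AI Studio
        self.clock = clock        # injectable so the limiter can be tested
        self.events = deque()     # (timestamp, tokens) per admitted request

    def _evict(self, now):
        # Drop requests that are at least 60 seconds old.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Record the request and return True if it fits in the window."""
        now = self.clock()
        self._evict(now)
        used_tokens = sum(t for _, t in self.events)
        if len(self.events) >= self.rpm or used_tokens + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        return True
```

A caller would check `limiter.try_acquire(estimated_tokens)` before each request and wait briefly when it returns `False`. The token estimate is the caller's own; the server's accounting may differ.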
429 backoff
On 429 ResourceExhausted, retry using exponential backoff with jitter, and respect any retry hint metadata returned by the API.
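The backoff policy above can be sketched as follows. This is not tied to any particular SDK: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on 429, and `retry_after` represents a server-provided hint in seconds, if one is available. The jitter variant shown is "full jitter" (a uniform random delay up to the exponential ceiling), which is one common choice among several.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the client library's 429 error."""

    def __init__(self, message="rate limited", retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after  # server hint in seconds, if any


def backoff_delay(attempt, base=1.0, cap=60.0, retry_after=None, rng=random.random):
    """Return seconds to sleep before retry number `attempt` (0-based)."""
    if retry_after is not None:
        return float(retry_after)  # a server hint takes precedence
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling  # full jitter: uniform in [0, ceiling)


def call_with_backoff(fn, max_attempts=5, sleep=time.sleep, **delay_kwargs):
    """Call fn(); on RateLimitError, wait and retry up to max_attempts calls."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(backoff_delay(attempt, retry_after=err.retry_after, **delay_kwargs))
```

In real use, `fn` would wrap the API call and translate the client's 429 exception into `RateLimitError`. Only throttling errors are retried here; other failures propagate immediately.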
Batch API discount
The Batch API offers 50% lower per-token cost than synchronous calls and uses separate quota buckets, which makes it a primary FinOps lever for workloads that are not latency-sensitive.
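The effect of the 50% discount can be estimated with simple arithmetic. In the sketch below, `token_cost_usd` is a hypothetical helper and the per-million-token price is a placeholder input, not a published rate; look up real prices for your model.

```python
BATCH_DISCOUNT = 0.5  # Batch API: 50% lower per-token cost than synchronous calls


def token_cost_usd(tokens, price_per_million_usd, batch=False):
    """Estimate cost for `tokens` at a given synchronous price per 1M tokens."""
    cost = tokens / 1_000_000 * price_per_million_usd
    return cost * BATCH_DISCOUNT if batch else cost
```

For example, at a placeholder price of 1.00 USD per million tokens, 2,000,000 tokens would cost 2.00 USD synchronously and 1.00 USD through the Batch API.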
Vertex AI alternative
For higher throughput needs, use Gemini via Vertex AI with provisioned throughput; quotas are governed by Vertex AI quotas and are raisable through the Cloud Console.