Google Cloud Platform Rate Limits

Google Cloud rate limits and quotas are scoped per API and per project. Each Cloud service documents its own quota policy on its `Quotas` page; cross-cutting limits also apply at the Cloud Resource Manager and IAM control planes. Most quotas are soft and can be raised through the Cloud Console quotas page, with requests typically reviewed within 1-3 business days. The numbers below are platform-wide patterns; consult the per-service quota pages for exact values.

Limits listed: 7. Throttle status: 429. Quota status: 403.
Tags: Cloud Platform, Multi-Product, Google Cloud, Rate Limiting

Limits

| Limit | Scope | Unit | Value | Notes |
| --- | --- | --- | --- | --- |
| Per-service per-project quotas | project/service | varies | see per-service quotas page | Each Cloud service publishes its own quotas; default values typically scale to standard production use and can be raised. |
| Default API key request rate | api-key | requests_per_second | see api-key quotas | Standard API key quotas (e.g., 10,000 requests/100 seconds for many APIs). Override via the Cloud Console. |
| Cloud Resource Manager API | project | requests_per_minute | 600 | |
| IAM read requests | project | requests_per_minute | 6000 | |
| IAM write requests | project | requests_per_minute | 60 | |
| Service Usage management | project | requests_per_minute | 600 | |
| Compute Engine API requests | project | requests_per_100_seconds | 1500 | Default quota for the Compute Engine API; raisable. |
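
One way to stay under a published limit is to pace requests on the client before they are sent. The sketch below is a minimal token bucket sized from a quota value and its window (for example, 60 per 60 seconds, or 1500 per 100 seconds). The class name `TokenBucket`, its parameters, and the `burst` default are illustrative assumptions, not part of any Google Cloud client library, and the server's exact accounting window may differ from this model.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical client-side limiter: about `limit` calls per `window_s` seconds."""

    def __init__(
        self,
        limit: int,
        window_s: float,
        burst: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = limit / window_s  # tokens refilled per second
        self.burst = burst            # maximum tokens that can accumulate
        self.clock = clock            # injectable clock, useful for tests
        self.tokens = burst
        self.last = clock()

    def try_acquire(self) -> bool:
        """Return True if a request may be sent now, False if it should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Example sizing: an IAM-write-style limit of 60 requests per minute.
iam_writes = TokenBucket(limit=60, window_s=60.0)
```

With `burst=1.0` the limiter spaces calls evenly at the quota rate; a larger `burst` tolerates short spikes at the cost of a higher chance of hitting the server-side limit.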

Policies

Per-service quotas
Each Cloud service exposes its own quotas under Cloud Console > IAM & Admin > Quotas. This page is a platform-level overview for orientation; always confirm values against the per-service docs.
Quota raises
Most quotas are soft. Submit a quota increase request via the Cloud Console; production raises typically resolve within 1-3 business days.
Backoff and retry
Implement exponential backoff with jitter on 429 and 403 quotaExceeded responses. Respect Retry-After headers when present, and make retried mutations safe to repeat where the API supports it (e.g., requestId as an idempotency key for Compute, ETags as preconditions for resource mutations).
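
The retry policy above can be sketched as follows. This is a minimal illustration rather than the behavior of any official client: `ThrottledError`, `backoff_delay`, and `call_with_backoff` are hypothetical names, and the caller is assumed to translate a 429 or 403 quotaExceeded response (and any Retry-After value, in seconds) into a `ThrottledError`. The base delay, cap, and attempt count are arbitrary defaults.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical error raised by the caller on HTTP 429 or a 403 quota error."""

    def __init__(self, status: int, retry_after: Optional[float] = None) -> None:
        super().__init__(f"throttled with HTTP {status}")
        self.status = status
        self.retry_after = retry_after  # seconds from Retry-After, if the server sent one


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 32.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call fn, retrying on ThrottledError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = backoff_delay(attempt, rng=rng)
            if err.retry_after is not None:
                # Never retry sooner than the server asked.
                delay = max(delay, err.retry_after)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

When the wrapped call is a mutation, generate any idempotency key (such as a requestId) once outside `fn` and reuse it on every attempt, so a retry after an ambiguous failure does not apply the change twice.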
Project-scoped isolation
Quotas are enforced per-project (and sometimes per-region or per-zone) by default; spread workloads across projects to scale beyond a single quota.
SLA-backed rate limiting
Some services (Spanner, Bigtable, Pub/Sub) use capacity-based throttling rather than hard quotas; provision capacity to absorb traffic.