Google Cloud Platform · Rate Limits

Google Cloud rate limits and quotas are scoped per API and per project. Each Cloud service documents its own quota policy on its `Quotas` page; cross-cutting limits also apply at the Cloud Resource Manager and IAM control plane. Most quotas are soft and can be raised through the Cloud Console quotas page, with requests typically reviewed within 1-3 days. The numbers below are platform-wide patterns; consult the per-service quota pages for exact values.

7 Limits · Throttle: 429 · Quota: 403

Cloud Platform · Multi-Product · Google Cloud · Rate Limiting

Limits

Per-service per-project quotas (scope: project/service)

Limit: varies; see the per-service quotas page.

Each Cloud service publishes its own quotas; default values are typically sized for standard production use and can be raised.

Default API key request rate (scope: api-key)

Limit: measured in requests_per_second; see the API key quotas.

Standard API key quotas apply (e.g., 10,000 requests per 100 seconds for many APIs). Override them via the Cloud Console.

Cloud Resource Manager API (scope: project)

Limit: 600 requests_per_minute (window: minute)

IAM read requests (scope: project)

Limit: 6000 requests_per_minute (window: minute)

IAM write requests (scope: project)

Unit: requests_per_minute (window: minute)

Service Usage management (scope: project)

Limit: 600 requests_per_minute (window: minute)

Compute Engine API requests (scope: project)

Limit: 1500 requests_per_100_seconds

Default quota for the Compute Engine API; raisable.
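One way to stay under a windowed limit such as the Compute Engine default (1500 requests per 100 seconds) is to pace requests on the client with a token bucket. The following is a minimal sketch, not part of any Google client library: the `TokenBucket` class, its `burst` parameter, and the injectable `clock` are all illustrative assumptions.

```python
import time


class TokenBucket:
    """Client-side pacer: sustained rate of limit/window_s, bursts up to `burst`."""

    def __init__(self, limit, window_s, burst=None, clock=time.monotonic):
        self.rate = limit / window_s  # tokens refilled per second
        # Default burst: one second's worth of tokens (assumption of this sketch).
        self.capacity = float(burst) if burst is not None else max(1.0, self.rate)
        self.tokens = self.capacity   # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, n=1):
        """Take n tokens if available; return False instead of blocking."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False

    def wait_time(self, n=1):
        """Seconds until n tokens will be available (0.0 if they already are)."""
        self._refill()
        return max(0.0, (n - self.tokens) / self.rate)


# Hypothetical usage with the Compute Engine default quoted above.
compute_bucket = TokenBucket(limit=1500, window_s=100)
if not compute_bucket.try_acquire():
    time.sleep(compute_bucket.wait_time())
```

A token bucket bounds the average rate; within any single window it can exceed the limit by up to the burst size, so server-side 429 or 403 responses still need to be handled.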

Policies

Per-service quotas

Each Cloud service exposes its own quotas under Cloud Console > IAM & Admin > Quotas. The platform-level figures on this page are for orientation only; always confirm them against the per-service docs.

Quota raises

Most quotas are soft. Submit a quota increase request via the Cloud Console; production raises typically resolve within 1-3 business days.

Backoff and retry

Implement exponential backoff with jitter on 429 and 403 quotaExceeded responses. Respect Retry-After headers, and reuse the same idempotency key on every retry where supported (e.g., requestId for Compute, ETags for resource mutations).
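The retry policy described here can be sketched with the standard library alone. Everything named below is an assumption made for illustration: the `send` callable and its `(status, reason, retry_after, body)` return shape are hypothetical, the set of retryable 403 reasons should be checked against the specific API's error documentation, and the "full jitter" variant (a uniform draw up to the exponential ceiling) is one common choice rather than a Google-prescribed one.

```python
import random
import time

# Assumed set of 403 error reasons that indicate quota pressure.
RETRYABLE_403_REASONS = {"quotaExceeded", "rateLimitExceeded"}


def is_retryable(status, reason=None):
    """429 is always retried; 403 only when the reason signals a quota problem."""
    if status == 429:
        return True
    return status == 403 and reason in RETRYABLE_403_REASONS


def backoff_delay(attempt, retry_after=None, base=1.0, cap=64.0, rng=random.random):
    """Delay in seconds before retrying after failed attempt `attempt` (0-based).

    A numeric Retry-After value takes precedence; otherwise use full jitter:
    a uniform draw from [0, min(cap, base * 2**attempt)).
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    return rng() * min(cap, base * 2 ** attempt)


def call_with_backoff(send, max_attempts=5, sleep=time.sleep, rng=random.random):
    """Call `send()` until it succeeds, fails non-retryably, or attempts run out.

    `send` is a hypothetical callable returning (status, reason, retry_after, body).
    Returns the last (status, body) pair seen.
    """
    for attempt in range(max_attempts):
        status, reason, retry_after, body = send()
        if not is_retryable(status, reason):
            return status, body
        if attempt < max_attempts - 1:  # no point sleeping after the final attempt
            sleep(backoff_delay(attempt, retry_after, rng=rng))
    return status, body
```

When the API supports it, `send` should attach the same idempotency token (such as a Compute `requestId`) on every attempt so that a retried mutation is not applied twice.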

Project-scoped isolation

Quotas are enforced per-project (and sometimes per-region or per-zone) by default; spread workloads across projects to scale beyond a single quota.
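As a rough planning aid for spreading load across projects, the arithmetic can be sketched as follows. The function names, the `headroom` factor, and the project IDs are hypothetical, and the sketch assumes the quota in question really is enforced independently per project.

```python
import itertools
import math


def projects_needed(target_per_window, quota_per_window, headroom=0.8):
    """Smallest number of projects whose combined usable quota covers the target.

    `headroom` is the fraction of each project's quota the plan is willing
    to consume (an assumption of this sketch, not a Google recommendation).
    """
    usable = quota_per_window * headroom
    return math.ceil(target_per_window / usable)


def round_robin(project_ids):
    """Endless iterator that cycles through project IDs in order."""
    return itertools.cycle(project_ids)


# Hypothetical example: 4000 requests per 100 seconds against a 1500 default.
count = projects_needed(4000, 1500)
picker = round_robin([f"example-project-{i}" for i in range(count)])
next_project = next(picker)
```

Round-robin is the simplest assignment; workloads with uneven request costs may need weighted or least-loaded selection instead.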

SLA-backed rate limiting

Some services (Spanner, Bigtable, Pub/Sub) use capacity-based throttling rather than hard quotas; provision capacity to absorb traffic.
