Amazon API Gateway · Rate Limits

AWS API Gateway Rate Limits

AWS API Gateway enforces a Regional account-level throttle plus per-API stage, method, and usage-plan throttles. The default is 10,000 requests per second (RPS) per account per Region with a 5,000-request burst bucket; some Regions default to 2,500 RPS with a 1,250-request burst. Control-plane operations (CreateRestApi, CreateDeployment, etc.) have separate fixed quotas. Account-level throttles can be increased through Service Quotas; control-plane throttles cannot.

12 limits · Throttle response: HTTP 429
API Management · Serverless · Rate Limiting · Throttling

Limits

Account-level throttle (default Regions) account/region
requests_per_second · second
10000
Across HTTP APIs, REST APIs, WebSocket APIs, and WebSocket callback APIs. Burst 5,000 via token-bucket. Increasable via Service Quotas (L-8A5B8E43).
Account-level throttle (low-default Regions) account/region
requests_per_second · second
2500
Lower default in: Africa (Cape Town), Europe (Milan / Spain / Zurich), Asia Pacific (Jakarta / Hyderabad / Melbourne / Malaysia / Thailand), Middle East (UAE), Israel (Tel Aviv), Canada West (Calgary), Mexico (Central). Burst 1,250.
Per-method throttle (REST API stage/method) stage-method
requests_per_second
-1
No fixed default (the -1 above is a placeholder). Configured per method via stage settings or a usage plan; when unset, the method inherits the account-level throttle. Burst is configurable.
Usage plan throttle api-key
requests_per_second
-1
No fixed default (the -1 above is a placeholder). Enforced per API key bound to a usage plan; rate, burst, and monthly quota are all configurable.
Portal throttle without access control account/region/portal
requests_per_second · second
250000
Hard cap; not increasable.
Portal throttle with access control account/region/portal
requests_per_second · second
10000
Hard cap; not increasable.
CreateApiKey account
requests_per_second · second
5
Control-plane fixed quota.
CreateDeployment account
requests_per_second · second
0.2
1 request every 5 seconds. Fixed.
CreateRestApi (Regional / private) account
requests_per_second · second
0.333
1 request every 3 seconds. Fixed.
CreateRestApi (edge-optimized) account
requests_per_second · second
0.0333
1 request every 30 seconds. Fixed.
PutRestApi account
requests_per_second · second
1
Other control-plane operations (aggregate) account
requests_per_second · second
10
10 RPS aggregate with 40 RPS burst across "other" operations. Not increasable.
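
The fixed control-plane rates listed above translate directly into a minimum spacing between calls (1 / rate). The following is a minimal sketch of client-side pacing under that reading; `Pacer` and `min_interval_seconds` are hypothetical helpers, not part of any AWS SDK, and the operation labels are just dictionary keys chosen for illustration.

```python
import time

# Control-plane rates from the list above, in requests per second.
CONTROL_PLANE_RPS = {
    "CreateApiKey": 5,
    "CreateDeployment": 0.2,
    "CreateRestApi(regional/private)": 1 / 3,
    "CreateRestApi(edge-optimized)": 1 / 30,
    "PutRestApi": 1,
}


def min_interval_seconds(operation):
    """Minimum spacing between calls implied by a fixed rate (1 / RPS)."""
    return 1.0 / CONTROL_PLANE_RPS[operation]


class Pacer:
    """Hypothetical helper: keeps successive calls at least `interval` apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Sleep if the next slot is in the future; return seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
        # The next slot opens one interval after this call's start time.
        self._next_allowed = now + delay + self.interval
        return delay
```

A deployment script could create `Pacer(min_interval_seconds("CreateDeployment"))` once and call `wait()` before each CreateDeployment request, so that calls land roughly 5 seconds apart.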

Policies

Token Bucket Burst
The account-level throttle uses a token-bucket algorithm: bursts above the steady-state RPS are tolerated up to the bucket capacity, after which requests are throttled. The account-level burst quota is set by AWS and is not customer-configurable.
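
To make the token-bucket behavior concrete, here is a simplified model using the default-Region figures from this page (10,000 RPS, 5,000 burst). It is an illustrative sketch, not API Gateway's actual implementation; the class and method names are assumptions.

```python
class TokenBucket:
    """Token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # the bucket starts full
        self.last = 0.0         # time of the last refill, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # in this model, the request would be throttled (429)


bucket = TokenBucket(rate=10_000, capacity=5_000)

# 6,000 requests arrive at the same instant: only the 5,000 burst is admitted.
admitted = sum(bucket.allow(now=0.0) for _ in range(6_000))

# 0.25 s later the bucket has refilled by 0.25 * 10,000 = 2,500 tokens,
# so 2,500 of the next 3,000 simultaneous requests are admitted.
refilled = sum(bucket.allow(now=0.25) for _ in range(3_000))

print(admitted, refilled)
```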
Backoff Strategy
AWS SDKs automatically retry 429 and 5xx responses using exponential backoff with jitter; clients using raw HTTP should do the same and honor Retry-After when present.
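
For raw-HTTP clients, the retry behavior described here might look like the sketch below. It uses the "full jitter" variant of exponential backoff, which is one common choice and not necessarily what any particular AWS SDK does. The `send` callable, the `base` and `cap` defaults, and the function names are all assumptions for illustration; only the delta-seconds form of Retry-After is handled.

```python
import random
import time


def is_retryable(status):
    """Retry on throttling (429) and server errors (5xx)."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt, base=0.5, cap=20.0, retry_after=None, rng=random.random):
    """Seconds to wait before the retry that follows 0-based `attempt`."""
    if retry_after is not None:
        return float(retry_after)  # a server hint takes priority
    ceiling = min(cap, base * 2 ** attempt)
    return rng() * ceiling  # full jitter: uniform in [0, ceiling)


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers_dict).
    Returns the last status code seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        last_attempt = attempt == max_attempts - 1
        if last_attempt or not is_retryable(status):
            return status
        # Use Retry-After only when it is a plain number of seconds.
        hint = headers.get("Retry-After")
        retry_after = hint if hint is not None and hint.isdigit() else None
        sleep(backoff_delay(attempt, retry_after=retry_after))
```

Capping the delay keeps late retries from waiting unboundedly, and the jitter spreads out clients that were throttled at the same moment.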
Service Quotas Increases
Account-level RPS quotas can be raised via the Service Quotas console (quota code L-8A5B8E43); control-plane and portal quotas are fixed and not increasable.
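
The same increase can be requested programmatically. The sketch below assumes a client that behaves like `boto3.client("service-quotas")` and assumes the Service Quotas service code for API Gateway is `apigateway`; the quota code is the one given on this page. The function name is hypothetical.

```python
QUOTA_CODE = "L-8A5B8E43"    # account-level throttle quota code from this page
SERVICE_CODE = "apigateway"  # assumed Service Quotas service code


def request_throttle_increase(client, desired_rps):
    """Request a higher account-level RPS quota if the current one is lower.

    `client` is expected to behave like boto3.client("service-quotas").
    Returns None when the current quota already meets `desired_rps`,
    otherwise the response of the increase request.
    """
    current = client.get_service_quota(
        ServiceCode=SERVICE_CODE, QuotaCode=QUOTA_CODE
    )["Quota"]["Value"]
    if current >= desired_rps:
        return None
    return client.request_service_quota_increase(
        ServiceCode=SERVICE_CODE,
        QuotaCode=QUOTA_CODE,
        DesiredValue=float(desired_rps),
    )
```

With boto3 installed and credentials configured, this could be invoked as `request_throttle_increase(boto3.client("service-quotas"), 20000)`. Because the quota is Regional, the client's Region determines which quota is affected.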
Per-Method / Per-Stage / Usage-Plan Layering
Throttling is layered. For a given request, the most restrictive of the account Regional throttle, stage throttle, method throttle, and usage-plan throttle wins.
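
Under the "most restrictive wins" rule, the effective steady-state rate for a request is simply the minimum of the limits that are configured. A small sketch, with a hypothetical function name and `None` standing for "not configured at this layer":

```python
def effective_rate(account_rps, stage_rps=None, method_rps=None, usage_plan_rps=None):
    """Return the most restrictive configured rate limit, in RPS.

    Layers left as None are treated as unset and do not constrain the result.
    """
    layers = (account_rps, stage_rps, method_rps, usage_plan_rps)
    configured = [rate for rate in layers if rate is not None]
    return min(configured)


# Account default 10,000 RPS, stage capped at 2,000, usage plan at 50:
# the usage plan is the binding limit.
print(effective_rate(10_000, stage_rps=2_000, usage_plan_rps=50))  # prints 50
```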
WAF Layering
AWS WAF rules apply before throttling; rate-based WAF rules can shape per-IP traffic independently of API Gateway throttles.
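
As an illustration of a rate-based WAF rule, the sketch below builds a rule in the shape used by the AWS WAFv2 API (a `RateBasedStatement` aggregated by source IP). The function name, the rule name, and the chosen limit are assumptions; the dictionary would still need to be placed in a web ACL that is associated with the API stage.

```python
def rate_based_rule(name, limit, priority=0):
    """Build a WAFv2-style rule that blocks an IP once it exceeds `limit`
    requests within the rule's evaluation window."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,
                "AggregateKeyType": "IP",  # count requests per source IP
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


# Hypothetical rule: block any single IP that sends more than 1,000 requests
# in one evaluation window.
rule = rate_based_rule("PerIpLimit", 1_000)
```

Because this limit is counted per IP, it can stop a single noisy client well before the shared account-level or usage-plan throttles are reached.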

Sources