Microsoft Azure API Management Rate Limits

Azure API Management is itself a rate-limit and quota engine for the APIs it fronts. Built-in policies (rate-limit, rate-limit-by-key, quota, quota-by-key) let operators throttle by subscription key, IP address, or arbitrary expression. Service-level capacity depends on the tier and on the number of scale units provisioned.

Limits: 4. Throttle status: 429.
Tags: Rate Limiting, API Gateway, API Management, Microsoft Azure

Limits

Limit: Consumption tier per-subscription
Scope: subscription
Metric: requests_per_minute
Value: see policy definition; no service-level cap by default
Notes: Operators define limits via rate-limit / quota policies; the Consumption tier is metered per call rather than capped at the service level.

Limit: Tier scale units
Scope: service
Metric: varies
Value (maximum scale units per tier): Developer 1, Basic 2, Basic v2 10, Standard 4, Standard v2 10, Premium 12 per region, Premium v2 30
Notes: Each scale unit provides a documented gateway throughput; rate-limit policies are configured on top of that capacity.

Limit: Rate-limit policy (per minute)
Scope: subscription/key/expression
Metric: requests_per_minute
Value: configurable per policy

Limit: Quota policy (per period)
Scope: subscription/key/expression
Metric: requests_per_period
Value: configurable per policy
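
To make the per-key semantics of the rate-limit entries above concrete, here is a minimal Python sketch of a fixed-window counter. It is an illustration only: the class name FixedWindowLimiter, its calls and renewal_period parameters, and the injectable clock are assumptions made for this example, and a fixed window is a simplification that may not match the algorithm the gateway actually uses.

```python
import math
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class FixedWindowLimiter:
    """Hypothetical limiter: at most `calls` requests per `renewal_period` seconds per key."""

    calls: int
    renewal_period: float
    # Injectable clock so the behaviour can be exercised without real waiting.
    clock: Callable[[], float] = time.monotonic
    # key -> (window start time, requests counted in that window)
    _windows: Dict[str, Tuple[float, int]] = field(default_factory=dict, init=False)

    def check(self, key: str) -> Tuple[int, Dict[str, str]]:
        """Return (status, headers): 200 if allowed, 429 plus Retry-After if throttled."""
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.renewal_period:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.calls:
            self._windows[key] = (start, count)
            remaining = start + self.renewal_period - now
            # Round up so a client that waits this long lands in the next window.
            return 429, {"Retry-After": str(max(1, math.ceil(remaining)))}
        self._windows[key] = (start, count + 1)
        return 200, {}
```

Each key (a subscription key, an IP address, or the result of any expression) gets its own counter, so one noisy caller does not consume another caller's allowance.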

Policies

rate-limit and rate-limit-by-key
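rate-limit caps the call rate per subscription; rate-limit-by-key caps it per value of an arbitrary key expression (for example, the caller's IP address). The window length is set in seconds through the renewal-period attribute, so it can be one minute or shorter. The gateway returns 429 with a Retry-After header when the limit is exceeded.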
quota and quota-by-key
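quota applies a call or bandwidth quota per subscription; quota-by-key applies it per key. Quotas cover longer, renewable windows (hour, day, week, or month). When a quota is exceeded, the gateway returns 403 Forbidden with a Retry-After header rather than 429.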
Capacity vs throttling
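The tier and its scale units determine the gateway's peak capacity; policies only throttle traffic within that capacity. Operators should provision enough scale units to serve the traffic their configured policy ceilings allow.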
Honor Retry-After
Clients should honor the Retry-After header returned with 429 responses when the gateway throttles a request.
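
A client-side retry loop that honors Retry-After might look like the following Python sketch. It is a sketch under stated assumptions, not a reference client: the helper names retry_after_seconds and call_with_retry, the send callable that returns a (status, headers, body) tuple, and the max_attempts and fallback_delay defaults are all hypothetical and chosen for illustration.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Parse a Retry-After value given as delay seconds or as an HTTP-date."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: let the caller pick a fallback
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    fallback_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send`, waiting as instructed by Retry-After whenever it returns 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        response = send()
        status, headers, _body = response
        if status != 429 or attempt >= max_attempts:
            return response
        # Header names are case-insensitive, so normalise before the lookup.
        lowered = {name.lower(): val for name, val in headers.items()}
        delay = retry_after_seconds(lowered.get("retry-after"))
        sleep(fallback_delay if delay is None else delay)
```

Passing the transport in as send keeps the retry logic independent of any particular HTTP library, and the injectable sleep makes the loop easy to exercise without real delays.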

Sources