Microsoft Azure Virtual Machines Rate Limits

Azure Virtual Machines API throttling is implemented by the Microsoft.Compute resource provider on top of Azure Resource Manager (ARM). Compute uses a token-bucket algorithm with two layers, per-resource and per-subscription, applied per region over a 1-minute window. Each VM API category (Put, Update, Delete, Low-Cost-Get, High-Cost-Get, Get-Operation, Guest-Patch) has its own bucket size and refill rate. Throttled requests return HTTP 429, and the x-ms-ratelimit-remaining-resource header reports the remaining tokens.


Limits

All limits are in requests per minute. "Refill/min" is the sustained rate; "Max bucket" is the burst capacity.

Category               Scope                 Refill/min   Max bucket
Put VM (create)        resource              4            12
Put VM (create)        subscription/region   500          1500
Update VM              resource              4            12
Update VM              subscription/region   500          1500
Delete VM              resource              4            12
Delete VM              subscription/region   500          1500
Low-Cost Get VM        resource              12           36
Low-Cost Get VM        subscription/region   8000         24000
High-Cost Get VM       subscription/region   300          900
Get Async Operation    resource              15           45
Get Async Operation    subscription/region   5000         15000
VM Guest Patch         resource              2            6
VM Guest Patch         subscription/region   200          600
VM Scale Set Put       subscription/region   125          375
VM Scale Set Update    subscription/region   500          1500
VM Scale Set Delete    subscription/region   175          525

Operations covered by each category:
Update VM: Update, Restart, Power Off, Start, Reapply, Capture, Run Command, Extension Create/Update/Delete, etc.
Delete VM: Delete, Simulate Eviction, Deallocate.
Low-Cost Get VM: single-VM Get, Instance View, Extensions Get, List Available Sizes, Boot Diagnostics, Run Command Get/List.
High-Cost Get VM (subscription bucket only): List, List All, List By Location.
Get Async Operation: polling the status of async VM operations.
VM Guest Patch: Assess Patches and Install Patches.
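
As a rough worked example of what these numbers imply, the sketch below estimates how long a burst of requests takes to be admitted by a single bucket. It assumes the bucket starts full and refills continuously at the listed per-minute rate; how the refill is spread within a minute is an assumption, and minutes_to_complete is a hypothetical helper, not part of any Azure SDK.

```python
def minutes_to_complete(requests: int, refill_per_min: float, max_bucket: float) -> float:
    """Minimum minutes until `requests` calls are all admitted by one bucket.

    Assumes a full bucket at the start and continuous refill (an assumption).
    """
    if requests <= max_bucket:
        # The whole burst fits in the bucket's burst capacity.
        return 0.0
    # Everything beyond the burst capacity is paced by the refill rate.
    return (requests - max_bucket) / refill_per_min


# 20 Update calls against one VM (resource bucket: refill 4/min, max 12).
print(minutes_to_complete(20, 4, 12))        # 2.0
# 5000 Put VM calls in one subscription/region (refill 500/min, max 1500).
print(minutes_to_complete(5000, 500, 1500))  # 7.0
```

In every row of the limits above the max bucket is three times the refill rate, so an empty bucket takes about three minutes to fill back up under the same assumption.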

Policies

Token bucket algorithm
Each policy category has a refill rate and a max bucket capacity. Each request consumes a token; when a bucket is empty, further requests are throttled until it refills. Buckets refill at the documented per-minute rate, up to the max bucket capacity.
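
The behavior described here can be sketched as a small client-side model. This is an illustration only, not Azure's implementation: the TokenBucket class and its field names are hypothetical, and continuous (rather than once-per-minute) refill is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Token bucket with a per-minute refill rate and a capped capacity."""

    refill_per_min: float  # tokens added per minute
    capacity: float        # maximum tokens the bucket can hold
    tokens: float = -1.0   # current tokens; a negative value means "start full"
    last: float = 0.0      # timestamp in seconds of the last refill

    def __post_init__(self) -> None:
        if self.tokens < 0:
            self.tokens = self.capacity

    def _refill(self, now: float) -> None:
        if now <= self.last:
            return  # no time has passed, nothing to add
        gained = (now - self.last) * self.refill_per_min / 60.0
        self.tokens = min(self.capacity, self.tokens + gained)
        self.last = now

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return False when throttled."""
        self._refill(now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Put VM resource bucket from the limits list: refill 4/min, max bucket 12.
bucket = TokenBucket(refill_per_min=4, capacity=12)
burst = [bucket.try_consume(now=0.0) for _ in range(13)]
print(burst.count(True))             # 12
print(bucket.try_consume(now=15.0))  # True
```

Passing the clock in as `now` keeps the model deterministic; a real client would use time.monotonic().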
Resource and subscription stacking
Both the resource bucket and the subscription bucket must have tokens for a request to succeed. Resource limits prevent a single VM from exhausting the subscription quota; subscription limits prevent a single subscription from impacting other tenants in the region.
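
A minimal sketch of the stacking rule follows. Refill is left out to keep it short, the admit function and bucket keys are hypothetical, and the choice not to charge any bucket for a rejected request is an assumption.

```python
def admit(buckets: dict[str, int], scopes: list[str]) -> bool:
    """Admit a request only if every applicable bucket has a token."""
    if any(buckets[scope] < 1 for scope in scopes):
        return False  # one empty bucket is enough to throttle the request
    for scope in scopes:
        buckets[scope] -= 1  # an admitted request is charged to every bucket
    return True


# Put VM limits: 12 tokens per VM, 1500 per subscription/region.
buckets = {"vm-a": 12, "vm-b": 12, "subscription": 1500}

results = [admit(buckets, ["vm-a", "subscription"]) for _ in range(13)]
print(results.count(True))                       # 12
print(buckets["subscription"])                   # 1488
print(admit(buckets, ["vm-b", "subscription"]))  # True
```

The thirteenth call against vm-a is rejected even though the subscription bucket still holds 1488 tokens, while a call against vm-b is unaffected.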
Honor x-ms-ratelimit-remaining-resource
This header is returned with every Compute response and shows the remaining tokens for each applicable policy. Use it to back off proactively before you receive a 429.
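
One way to act on the header is sketched below. It assumes the value is a comma-separated list of policy;remaining pairs; the policy names in the sample, the helper names, and the threshold of 5 are all hypothetical, so check them against real responses before relying on this.

```python
def parse_remaining(header_value: str) -> dict[str, int]:
    """Parse 'policyA;3,policyB;1480' into {'policyA': 3, 'policyB': 1480}."""
    remaining: dict[str, int] = {}
    for part in header_value.split(","):
        name, sep, count = part.strip().rpartition(";")
        if not sep:
            continue  # skip entries that are not 'name;count'
        try:
            remaining[name.strip()] = int(count)
        except ValueError:
            continue  # skip entries whose count is not an integer
    return remaining


def should_slow_down(header_value: str, threshold: int = 5) -> bool:
    """True when any applicable policy is at or below the threshold."""
    remaining = parse_remaining(header_value)
    return bool(remaining) and min(remaining.values()) <= threshold


# Hypothetical header value; real policy names may differ.
sample = "Microsoft.Compute/PolicyA;3, Microsoft.Compute/PolicyB;1480"
print(parse_remaining(sample))
print(should_slow_down(sample))  # True
```

Taking the minimum across policies reflects that the tightest bucket is the one that will throttle first.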
Exponential backoff with Retry-After
SDKs auto-retry 429s with exponential backoff. Custom clients should honor Retry-After and add jitter.
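
For a custom client, the delay calculation might look like the sketch below. The retry_delay function, its defaults (1-second base, 60-second cap), and the jitter strategy are assumptions for illustration; it only handles Retry-After given in seconds, not as an HTTP date.

```python
import random
from typing import Callable, Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    if retry_after is not None:
        try:
            server_delay = max(0.0, float(retry_after))
        except ValueError:
            server_delay = None  # not a number of seconds; fall back to backoff
        if server_delay is not None:
            # Wait at least as long as the server asked, plus a little jitter
            # so many clients do not retry at the same instant.
            return server_delay + rng() * base
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return rng() * min(cap, base * (2 ** attempt))


print(retry_delay(0, retry_after="10"))  # between 10.0 and 11.0
print(retry_delay(3))                    # between 0.0 and 8.0
```

A caller would sleep for the returned number of seconds and give up after a fixed number of attempts.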
Use batch / list operations
Prefer a single List call over a loop of per-VM Get calls. Use Get Async Operation polling sparingly: poll fewer operations, at longer intervals.
ARM token-bucket layering
In addition to the Compute-specific limits, ARM-level subscription buckets also apply: reads have a bucket size of 250 and refill at 25 per second; writes have a bucket size of 200 and refill at 10 per second.
