Power BI Rate Limits

Power BI REST APIs throttle per user to prevent overuse: when a calling identity exceeds an operation's time-window threshold, the service returns HTTP 429 with a Retry-After header. Per-second and per-minute ceilings are not exhaustively published; specific operation groups (admin, embed-token, capacity, dataset refresh) carry their own published per-hour and per-day caps in the relevant operation reference. For Power BI Embedded and Fabric F-SKU capacities, throughput is governed by the capacity's v-cores rather than by an API quota: utilization is evaluated every 30 seconds, and a capacity that exceeds 100% utilization enters "interactive request delay" mode.

4 limits · Throttle response: HTTP 429
Tags: Analytics, Business Intelligence, Microsoft, Rate Limiting

Limits

Per-user request throttling
Scope: user
Unit: varies
Limit: per-operation thresholds; see operation reference
Notes: Power BI throttles when a single user exceeds the per-operation time-window threshold; HTTP 429 + Retry-After is returned.

Embed-token generation (Pro / PPU)
Scope: user
Unit: varies
Limit: limited number per master account or service principal (development testing only)
Notes: Production embedding requires assigning a Power BI Embedded (A SKU) or Fabric (F SKU) capacity; capacity removes the embed-token cap.

Embedded / Premium capacity utilization
Scope: capacity
Unit: v-core_seconds
Limit: 30 seconds of CPU per v-core per 30-second cycle
Notes: Each evaluation cycle aggregates interactive (current cycle) plus background (1/2880 of past 24h) operations.

Capacity overload behavior
Scope: capacity
Unit: percent_overload
Limit: no delay below 10% overload; up to 20s interactive delay at 100% overload
Notes: Capacity remains in interactive-request-delay mode until the previous evaluation drops below 100% utilization.
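
The capacity arithmetic above can be illustrated with a short sketch. It assumes utilization is the ratio of aggregated CPU seconds to the cycle budget (v-cores × 30 seconds) and that "overload" means the percentage by which utilization exceeds 100%. The function and parameter names are hypothetical, and the numbers in the usage note are made up for illustration. The factor 2880 is the number of 30-second cycles in 24 hours.

```python
CYCLE_SECONDS = 30
# Number of 30-second evaluation cycles in 24 hours: 86400 / 30 = 2880.
CYCLES_PER_DAY = 24 * 60 * 60 // CYCLE_SECONDS


def cycle_utilization(v_cores: int,
                      interactive_cpu_s: float,
                      background_cpu_s_24h: float) -> float:
    """Return utilization for one evaluation cycle as a fraction (1.0 == 100%).

    Assumption: utilization = (interactive CPU in this cycle
    + 1/2880 of background CPU over the past 24h) / cycle budget.
    """
    budget = v_cores * CYCLE_SECONDS  # 30 CPU-seconds per v-core per cycle
    smoothed_background = background_cpu_s_24h / CYCLES_PER_DAY
    return (interactive_cpu_s + smoothed_background) / budget


def overload_percent(utilization: float) -> float:
    """Assumed definition: percentage points above 100% utilization."""
    return max(0.0, (utilization - 1.0) * 100.0)


if __name__ == "__main__":
    u = cycle_utilization(v_cores=8,
                          interactive_cpu_s=200.0,
                          background_cpu_s_24h=288_000.0)
    print(f"utilization={u:.2%}, overload={overload_percent(u):.0f}%")
```

With 8 v-cores the cycle budget is 240 CPU-seconds; 200 interactive seconds plus 288,000 / 2880 = 100 smoothed background seconds gives 300 / 240, so this hypothetical capacity would be over its budget for that cycle. How the service maps a given overload to a specific delay between the stated endpoints is not described here, so the sketch does not model it.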

Policies

Honor Retry-After
On 429, clients must wait the number of seconds indicated in Retry-After before retrying.
Capacity-bound throttling
For Embedded and Fabric workloads, scale up the v-core capacity (or use autoscale where supported) rather than retrying through 429 storms.
Cross-region operations
Operations that download files from a different region than the call origin may take longer; treat regional differences as latency, not throttling.
Operation-level limits
Specific operations (admin scans, dataset refresh, embed-token generation) have their own caps documented in the per-operation reference; consult the operation page rather than assuming a uniform global limit.
Development vs production embedding
Trial embed tokens with Pro/PPU are for development testing only; production embedding must run on purchased capacity.
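
The "Honor Retry-After" policy can be sketched as a small retry wrapper. This is a minimal illustration under stated assumptions, not a definitive client: `send` is a hypothetical callable that performs one HTTP request and returns `(status, headers, body)`, the header mapping is assumed to use the exact key `Retry-After`, the header value is assumed to be a number of seconds, and the 30-second fallback and the attempt limit are arbitrary choices.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str],
                        default: float = 30.0) -> float:
    """Parse Retry-After as seconds; fall back to `default` (an assumption)
    when the header is missing or not numeric."""
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 response.

    After each 429, wait for the duration given by Retry-After before
    the next attempt; give up after `max_attempts` throttled responses.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts:
            sleep(retry_after_seconds(headers))
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the wrapper easy to exercise without real delays. For capacity-bound workloads, a bounded attempt count like this also avoids retrying indefinitely when the underlying fix is to scale the capacity.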
