GitHub Copilot · Rate Limits

GitHub Copilot Rate Limits

The GitHub Copilot management APIs are part of the GitHub REST API and inherit its primary and secondary rate limits, which are scoped per user, per installation, or per IP address. End-user inference is not throttled per second at the API level. Instead, Copilot meters consumption through per-plan premium request quotas (50, 300, and 1,500 per month for Free, Pro, and Pro+), together with per-model rate caps that are applied transparently in the IDE and CLI.

11 limits · Throttle status: 429

Tags: Rate Limiting · AI · Developer Tools

Limits

Unauthenticated REST (per IP) · scope: IP · 60 requests per hour

Authenticated REST (PAT / OAuth user token) · scope: user · 5,000 requests per hour

GitHub App installations · scope: installation · 5,000 requests per hour
Scales up to 12,500/hour based on repos/users; 15,000/hour on Enterprise Cloud.

GitHub Actions GITHUB_TOKEN · scope: repository · 1,000 requests per hour

Concurrent requests (REST + GraphQL) · scope: user · 100 concurrent requests

Secondary - REST CPU points · scope: user · 900 points per minute

Secondary - content-creating requests · scope: user · 80 requests per minute

Secondary - content-creating requests (hourly) · scope: user · 500 requests per hour

Copilot premium requests (Free) · scope: user · 50 requests per month
Plan-level quota for premium model invocations; not a request-rate cap.

Copilot premium requests (Pro) · scope: user · 300 requests per month

Copilot premium requests (Pro+) · scope: user · 1,500 requests per month
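A client can usually see where it stands against the hourly REST limits by reading the `x-ratelimit-*` headers that GitHub attaches to REST responses. The sketch below is illustrative only: `RateLimitStatus` and `parse_rate_limit` are hypothetical names, and it assumes the headers arrive as a plain dictionary of strings.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RateLimitStatus:
    limit: int          # x-ratelimit-limit: ceiling for the current window
    remaining: int      # x-ratelimit-remaining: requests left in the window
    reset_at: datetime  # x-ratelimit-reset: when the window resets (UTC)


def parse_rate_limit(headers: dict) -> RateLimitStatus:
    """Extract primary rate-limit state from REST response headers."""
    # Header names are case-insensitive, so normalise them before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    return RateLimitStatus(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        # The reset value is expressed in UTC epoch seconds.
        reset_at=datetime.fromtimestamp(int(h["x-ratelimit-reset"]), tz=timezone.utc),
    )
```

For example, `parse_rate_limit({"X-RateLimit-Limit": "5000", "X-RateLimit-Remaining": "4987", "X-RateLimit-Reset": "1700000000"})` yields a status with `remaining == 4987`. Checking `remaining` before a burst of calls is one way to avoid running into the limit in the first place.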

Policies

Backoff Strategy

Honor the Retry-After header on rate-limit responses (429 or 403); when it is absent, fall back to exponential backoff with jitter.
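The backoff policy can be sketched as a small delay calculator. This is a minimal illustration, not a reference implementation: `retry_delay`, `base`, and `cap` are hypothetical names and defaults, the jitter variant shown is "full jitter", and `Retry-After` is assumed to carry a number of seconds.

```python
import random


def retry_delay(headers: dict, attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying a rate-limited (403/429) response."""
    # Header names are case-insensitive, so normalise them before lookup.
    h = {k.lower(): v for k, v in headers.items()}

    # An explicit Retry-After (assumed to be in seconds) always wins.
    if "retry-after" in h:
        return float(h["retry-after"])

    # Otherwise use exponential backoff with full jitter: pick a random
    # delay between 0 and min(cap, base * 2**attempt).
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)
```

A caller would typically loop over attempts, sleep for `retry_delay(response_headers, attempt)`, and give up after a fixed number of retries. Randomizing the delay keeps many clients from retrying in lockstep.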

Primary vs secondary limits

GitHub distinguishes primary limits (numeric per-hour quotas) from secondary limits (abuse-prevention rules such as CPU points and content-creation caps). Exceeding either can return a 403 or 429 response.
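Because both kinds of limit surface with the same status codes, a client has to look at the headers and error body to tell them apart. The following sketch is a heuristic under stated assumptions: `classify_limit` is a hypothetical helper, a primary limit is assumed to show `x-ratelimit-remaining: 0`, and a secondary limit is assumed to show either a `retry-after` header or an error message that mentions a secondary rate limit.

```python
def classify_limit(status: int, headers: dict, message: str = "") -> str:
    """Return 'primary', 'secondary', or 'none' for a REST response."""
    if status not in (403, 429):
        return "none"

    # Header names are case-insensitive, so normalise them before lookup.
    h = {k.lower(): v for k, v in headers.items()}

    # Primary limit: the hourly quota for this window is used up.
    if h.get("x-ratelimit-remaining") == "0":
        return "primary"

    # Secondary limit: signalled by Retry-After or by the error message
    # (the substring match is an assumption about the message wording).
    if "retry-after" in h or "secondary rate limit" in message.lower():
        return "secondary"

    # Anything else, for example an ordinary 403 permission error.
    return "none"
```

The distinction matters for recovery: a primary limit clears when the hourly window resets, while a secondary limit calls for slowing down the request pattern itself.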

Premium request budget

Copilot inference (chat, agent mode, code review) consumes premium requests; once exhausted, paid plans bill overage at $0.04/request and free plans pause premium features until the next month.
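The overage arithmetic is simple enough to check by hand, and a short sketch makes it explicit. The quotas and the $0.04 price are the figures quoted on this page; `MONTHLY_QUOTA`, `OVERAGE_PRICE`, and `overage_cost` are hypothetical names, and `used` is assumed to already be expressed in premium requests.

```python
from decimal import Decimal

# Monthly premium request quotas per plan, as listed above.
MONTHLY_QUOTA = {"free": 50, "pro": 300, "pro+": 1500}
# USD charged per premium request beyond the quota on paid plans.
OVERAGE_PRICE = Decimal("0.04")


def overage_cost(plan: str, used: int) -> Decimal:
    """Estimated overage charge in USD for one month of usage."""
    if plan == "free":
        # Free plans pause premium features instead of billing overage.
        return Decimal("0")
    extra = max(0, used - MONTHLY_QUOTA[plan])
    return extra * OVERAGE_PRICE
```

For instance, a Pro user who consumes 350 premium requests in a month is 50 over the 300 quota, so `overage_cost("pro", 350)` comes to 50 × $0.04 = $2.00.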

Conditional requests

Use ETag / If-None-Match for read-mostly endpoints. A 304 Not Modified response to a correctly authenticated request does not count against the primary rate limit.
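One way to apply conditional requests is a small cache that remembers each URL's ETag and replays it in `If-None-Match`. The sketch below is an assumption-laden illustration: `ETagCache` and the `Transport` callable are hypothetical, and the transport stands in for whatever HTTP client actually sends the request (including its Authorization header).

```python
from typing import Callable, Dict, Optional, Tuple

# (status, response headers, body) as returned by the hypothetical transport.
Response = Tuple[int, Dict[str, str], Optional[bytes]]
Transport = Callable[[str, Dict[str, str]], Response]


class ETagCache:
    """Caches bodies by URL and revalidates them with If-None-Match."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._entries: Dict[str, Tuple[str, bytes]] = {}  # url -> (etag, body)

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached:
            # Ask the server to reply 304 if the resource is unchanged.
            headers["If-None-Match"] = cached[0]

        status, resp_headers, body = self._transport(url, headers)

        if status == 304 and cached:
            return cached[1]  # unchanged: reuse the stored body

        # Store the fresh body together with its ETag for next time.
        etag = {k.lower(): v for k, v in resp_headers.items()}.get("etag")
        if status == 200 and etag and body is not None:
            self._entries[url] = (etag, body)
        return body or b""
```

In use, the first `get` for a URL sends no validator and stores the returned ETag; later calls send it back and reuse the cached body whenever the server answers 304.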
