Kubernetes Services Rate Limits

The Kubernetes Services API itself does not impose request-per-second limits on data-plane traffic to a Service. Control-plane operations (creating, updating, listing Services through the kube-apiserver) are subject to the API server's API Priority and Fairness (APF) just like any other Kubernetes API resource. Data-plane throughput depends on the chosen Service type and the underlying cloud-provider load balancer.

Limits: 2 · Throttle response: HTTP 429

Tags: Rate Limiting, Kubernetes, Networking, Open Source

Limits

Limit | Scope | Value | Notes
kube-apiserver Service CRUD (control plane) | priority-level | concurrent_requests | Governed by the API Priority and Fairness PriorityLevelConfiguration
Service data-plane throughput | load-balancer | varies | Depends on the cloud-provider LB type and instance class; see provider quotas
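The control-plane limit is expressed as concurrent requests per priority level rather than as requests per second. As a rough sketch of how that budget is divided, the snippet below follows the formula described in the Kubernetes API Priority and Fairness documentation: each priority level receives a share of the server-wide concurrency limit in proportion to its nominalConcurrencyShares, rounded up. The priority-level names and share values are hypothetical, and the in-flight values of 400 and 200 are assumed settings for `--max-requests-inflight` and `--max-mutating-requests-inflight`; check your own cluster's configuration before relying on any of these numbers.

```python
def nominal_concurrency_limits(server_concurrency_limit, shares):
    """Split a server-wide concurrency limit across priority levels.

    `shares` maps a priority-level name to its nominalConcurrencyShares.
    Each level gets ceil(server_limit * share / total_shares) seats.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one priority level needs a positive share")
    return {
        # Integer ceiling division avoids floating-point rounding surprises.
        name: (server_concurrency_limit * share + total - 1) // total
        for name, share in shares.items()
    }


# Assumed kube-apiserver settings (illustrative, not read from a cluster).
max_requests_inflight = 400
max_mutating_requests_inflight = 200
server_cl = max_requests_inflight + max_mutating_requests_inflight

# Hypothetical priority levels and shares, for illustration only.
example_shares = {
    "service-controllers": 60,
    "interactive-users": 40,
    "batch-tooling": 100,
}

example_limits = nominal_concurrency_limits(server_cl, example_shares)
for level, seats in example_limits.items():
    print(f"{level}: {seats} concurrent requests")
```

Under these assumptions, a client that creates or lists Services competes only for the seats of the priority level its requests are classified into, which is why no single requests-per-second figure applies.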

Policies

APF on control plane

Service create, update, delete, and list operations against kube-apiserver are subject to API Priority and Fairness. Clients that reconcile frequently should use watches and informers rather than polling with repeated list calls.
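To illustrate why a watch is cheaper than polling, the sketch below keeps a local cache of Services up to date from a stream of watch events instead of re-listing. It assumes the newline-delimited JSON format that the API server returns for `?watch=true` requests, where each event has a `type` (ADDED, MODIFIED, DELETED, BOOKMARK, or ERROR) and an `object`. The sample events are invented for the example, and a production client would normally use an informer from a client library rather than hand-rolled code like this.

```python
import json


def apply_watch_event(cache, event):
    """Apply one watch event to a cache keyed by "namespace/name"."""
    kind = event.get("type")
    obj = event.get("object", {})
    if kind in ("BOOKMARK", "ERROR"):
        return cache  # neither carries a Service to store
    meta = obj.get("metadata", {})
    key = f"{meta.get('namespace', '')}/{meta.get('name', '')}"
    if kind in ("ADDED", "MODIFIED"):
        cache[key] = obj
    elif kind == "DELETED":
        cache.pop(key, None)
    return cache


def replay(lines):
    """Consume newline-delimited watch events.

    Returns the cache and the last resourceVersion seen, which a client
    would pass back to resume the watch after a disconnect.
    """
    cache, last_rv = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        rv = event.get("object", {}).get("metadata", {}).get("resourceVersion")
        if event.get("type") != "ERROR" and rv:
            last_rv = rv
        apply_watch_event(cache, event)
    return cache, last_rv


# Invented sample stream, for illustration only.
sample_stream = [
    '{"type": "ADDED", "object": {"metadata": {"namespace": "default", "name": "web", "resourceVersion": "10"}, "spec": {"type": "ClusterIP"}}}',
    '{"type": "ADDED", "object": {"metadata": {"namespace": "default", "name": "api", "resourceVersion": "11"}, "spec": {"type": "ClusterIP"}}}',
    '{"type": "MODIFIED", "object": {"metadata": {"namespace": "default", "name": "web", "resourceVersion": "12"}, "spec": {"type": "LoadBalancer"}}}',
    '{"type": "DELETED", "object": {"metadata": {"namespace": "default", "name": "api", "resourceVersion": "13"}, "spec": {"type": "ClusterIP"}}}',
    '{"type": "BOOKMARK", "object": {"metadata": {"resourceVersion": "14"}}}',
]

services, resume_from = replay(sample_stream)
print(sorted(services), resume_from)
```

After the initial list, each change costs one small event on an already open connection, so the client does not spend a concurrency seat on a fresh list request for every reconciliation pass.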

kube-proxy / iptables / IPVS scaling

ClusterIP and NodePort throughput depends on the kube-proxy mode (iptables, IPVS, or nftables) and on node kernel networking; there is no documented numeric limit.

External LB quotas

Services of type LoadBalancer provision a cloud-provider load balancer, so per-LB and per-account quotas (for example NLB connection limits, ALB rule limits, and GCLB forwarding-rule limits) apply.

EndpointSlice fan-out

Very large Services (thousands of endpoints) should rely on EndpointSlice (the default in modern Kubernetes) rather than the legacy Endpoints object to reduce kube-apiserver load.
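The arithmetic behind this recommendation can be sketched as follows. The example assumes 100 endpoints per slice, which is the default for kube-controller-manager's `--max-endpoints-per-slice` flag, and it assumes slices are fully packed, which real clusters do not guarantee. The function names are hypothetical helpers written for this illustration.

```python
def endpoint_slices_needed(endpoint_count, max_per_slice=100):
    """Minimum number of EndpointSlice objects for a Service (ideal packing)."""
    if endpoint_count < 0 or max_per_slice <= 0:
        raise ValueError("endpoint_count must be >= 0 and max_per_slice > 0")
    return -(-endpoint_count // max_per_slice)  # ceiling division


def entries_rewritten_on_change(endpoint_count, max_per_slice=100, use_slices=True):
    """Upper bound on endpoint entries in the object rewritten when one
    endpoint changes: one slice with EndpointSlice, or the whole list with
    the legacy Endpoints object."""
    if use_slices:
        return min(endpoint_count, max_per_slice)
    return endpoint_count


endpoints = 5000  # hypothetical Service size
print(endpoint_slices_needed(endpoints))                         # slices
print(entries_rewritten_on_change(endpoints))                    # with slices
print(entries_rewritten_on_change(endpoints, use_slices=False))  # legacy
```

Under these assumptions, a single pod restart in a 5000-endpoint Service rewrites an object holding at most 100 entries instead of one holding all 5000, and every watcher of that Service receives the smaller update.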
