Kubernetes Services · Rate Limits

The Kubernetes Services API does not itself impose requests-per-second limits on data-plane traffic to a Service. Control-plane operations (creating, updating, and listing Services through the kube-apiserver) are subject to the API server's API Priority and Fairness (APF), like requests for any other Kubernetes API resource. Data-plane throughput depends on the chosen Service type and, for Type=LoadBalancer Services, on the underlying cloud-provider load balancer.

Limits: 2 · Throttle response: HTTP 429
Tags: Rate Limiting, Kubernetes, Networking, Open Source
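
A client that is throttled by the API server sees an HTTP 429 response, which typically carries a Retry-After header. The following is a minimal sketch of honoring that signal, not a description of how any particular Kubernetes client library does it. The `send` callable, the `(status, headers, body)` response shape, and all parameter names are assumptions made for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str], fallback: float) -> float:
    """Return the Retry-After value in seconds, or fallback if absent or invalid."""
    raw = None
    for key, value in headers.items():
        if key.lower() == "retry-after":  # header names are case-insensitive
            raw = value
            break
    try:
        seconds = float(raw)
    except (TypeError, ValueError):
        # Missing header, or a non-numeric value such as an HTTP date.
        return fallback
    return seconds if seconds >= 0 else fallback


def call_with_throttle_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry while the server answers 429.

    The wait before each retry is the server's Retry-After value when present,
    otherwise an exponential backoff starting at default_delay. Every wait is
    capped at max_delay. The last response is returned even if it is still 429.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for attempt in range(1, max_attempts):
        status, headers, _body = response
        if status != 429:
            break
        backoff = default_delay * 2 ** (attempt - 1)
        sleep(min(retry_after_seconds(headers, backoff), max_delay))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays. In a real controller, `send` would wrap whatever HTTP or client call issues the Service request.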

Limits

kube-apiserver Service CRUD (control plane)
Limit: concurrent requests (concurrent_requests) per priority level, governed by the API Priority and Fairness PriorityLevelConfiguration.
Service data-plane throughput
Limit: varies per load balancer; depends on the cloud-provider LB type / instance class (see provider quotas).
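
The control-plane limit is a concurrency budget rather than a fixed number. As described in the Kubernetes APF documentation, each priority level's nominal concurrency limit is its share of the server's total in-flight budget, rounded up: NominalCL(i) = ceil(ServerCL * NCS(i) / sum of all NCS), where ServerCL is the sum of the `--max-requests-inflight` and `--max-mutating-requests-inflight` flags and NCS is the level's `nominalConcurrencyShares`. The sketch below works through that arithmetic. The priority-level names and share values are made up for illustration and are not the cluster defaults.

```python
from typing import Dict


def nominal_concurrency_limits(server_cl: int, shares: Dict[str, int]) -> Dict[str, int]:
    """Split a server-wide concurrency budget across priority levels.

    server_cl: total in-flight budget (max-requests-inflight plus
               max-mutating-requests-inflight).
    shares:    nominalConcurrencyShares per priority level.
    Returns ceil(server_cl * share / total_shares) for each level.
    """
    total = sum(shares.values())
    if server_cl < 0 or total <= 0 or any(s < 0 for s in shares.values()):
        raise ValueError("budget and shares must be non-negative, with a positive total")
    # -(-a // b) is integer ceiling division for positive b.
    return {name: -(-server_cl * share // total) for name, share in shares.items()}


# Illustrative values only: a 400 + 200 in-flight budget and three
# hypothetical priority levels.
example = nominal_concurrency_limits(
    400 + 200,
    {"level-a": 40, "level-b": 100, "level-c": 20},
)
# example == {"level-a": 150, "level-b": 375, "level-c": 75}
```

Because each level is rounded up independently, the per-level limits can add up to slightly more than the server-wide budget.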

Policies

APF on control plane
Service create/update/delete/list operations against kube-apiserver are subject to API Priority and Fairness; high-frequency reconciliation should use watches and informers rather than polling.
kube-proxy / iptables / IPVS scaling
ClusterIP and NodePort throughput depends on the kube-proxy mode (iptables, IPVS, nftables) and on node kernel networking; there is no documented numeric limit.
External LB quotas
Type=LoadBalancer Services provision a cloud-provider LB; per-LB and per-account quotas (NLB connection limits, ALB rule limits, GCLB forwarding-rule limits) apply.
Endpoint slice fan-out
Very large Services (thousands of endpoints) should rely on EndpointSlice (the default in modern Kubernetes) rather than the legacy Endpoints object, which reduces kube-apiserver load.
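
To make the EndpointSlice point concrete, here is a rough back-of-the-envelope sketch. It assumes the default cap of 100 endpoints per slice (configurable through the kube-controller-manager `--max-endpoints-per-slice` flag) and assumes slices are fully packed, which the controller does not guarantee, so the slice count is a lower bound. The function name and return shape are hypothetical.

```python
from typing import Tuple


def endpoint_slice_fanout(endpoints: int, max_per_slice: int = 100) -> Tuple[int, int]:
    """Estimate EndpointSlice fan-out for a Service.

    Returns (min_slices, max_entries_in_one_object):
      min_slices                 lower bound on slice count, assuming full packing
      max_entries_in_one_object  most endpoints any single slice object can hold,
                                 which bounds the size of the object rewritten
                                 when one endpoint changes
    """
    if endpoints < 0 or max_per_slice < 1:
        raise ValueError("endpoints must be >= 0 and max_per_slice >= 1")
    min_slices = -(-endpoints // max_per_slice)  # integer ceiling division
    return min_slices, min(endpoints, max_per_slice)


# A Service with 5,000 endpoints: at least 50 slices, and a single endpoint
# change rewrites an object holding at most 100 entries. A single legacy
# Endpoints object would instead carry every address in one object.
slices, entries = endpoint_slice_fanout(5000)
```

The point of the comparison is the size of each update, not the object count: smaller objects mean less data sent to every watcher when one endpoint changes.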