Kubernetes Services Rate Limits
The Kubernetes Services API itself does not impose request-per-second limits on data-plane traffic to a Service. Control-plane operations (creating, updating, listing Services through the kube-apiserver) are subject to the API server's API Priority and Fairness (APF) just like any other Kubernetes API resource. Data-plane throughput depends on the chosen Service type and the underlying cloud-provider load balancer.
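Throttle: HTTP 429 (Too Many Requests), returned by kube-apiserver when API Priority and Fairness rejects a request
Tags: Rate Limiting, Kubernetes, Networking, Open Source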
Limits
kube-apiserver Service CRUD (control plane)
Type: priority-level
Limit: governed by the API Priority and Fairness PriorityLevelConfiguration
Service data-plane throughput
Type: load-balancer
Limit: depends on the cloud-provider LB type and instance class; see provider quotas
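As a rough illustration of what "governed by a PriorityLevelConfiguration" can look like, the sketch below builds a PriorityLevelConfiguration and a matching FlowSchema for Service requests as plain Python dictionaries and serializes them to JSON. The field names follow the flowcontrol.apiserver.k8s.io/v1 API; the object names (service-controllers, service-crud), the service account, and every numeric value are assumptions chosen for illustration, not recommended settings.

```python
import json


def service_apf_manifests(account="svc-reconciler", namespace="platform", shares=20):
    """Build APF objects that give one service account's Service requests
    their own priority level. All names and numbers are illustrative assumptions."""
    priority_level = {
        "apiVersion": "flowcontrol.apiserver.k8s.io/v1",
        "kind": "PriorityLevelConfiguration",
        "metadata": {"name": "service-controllers"},  # hypothetical name
        "spec": {
            "type": "Limited",
            "limited": {
                # Relative share of the API server's concurrency budget.
                "nominalConcurrencyShares": shares,
                # Queue excess requests instead of rejecting them immediately.
                "limitResponse": {
                    "type": "Queue",
                    "queuing": {"queues": 16, "handSize": 4, "queueLengthLimit": 50},
                },
            },
        },
    }
    flow_schema = {
        "apiVersion": "flowcontrol.apiserver.k8s.io/v1",
        "kind": "FlowSchema",
        "metadata": {"name": "service-crud"},  # hypothetical name
        "spec": {
            "priorityLevelConfiguration": {"name": priority_level["metadata"]["name"]},
            "matchingPrecedence": 1000,
            "distinguisherMethod": {"type": "ByUser"},
            "rules": [
                {
                    "subjects": [
                        {
                            "kind": "ServiceAccount",
                            "serviceAccount": {"name": account, "namespace": namespace},
                        }
                    ],
                    # Match Service CRUD, list, and watch in every namespace.
                    "resourceRules": [
                        {
                            "verbs": ["create", "update", "patch", "delete", "list", "watch"],
                            "apiGroups": [""],
                            "resources": ["services"],
                            "namespaces": ["*"],
                        }
                    ],
                }
            ],
        },
    }
    return [priority_level, flow_schema]


if __name__ == "__main__":
    # Wrap both objects in a v1 List so they can be applied as one JSON manifest.
    manifest = {"apiVersion": "v1", "kind": "List", "items": service_apf_manifests()}
    print(json.dumps(manifest, indent=2))
```

Keeping the two objects in one function makes the FlowSchema's priorityLevelConfiguration reference and the PriorityLevelConfiguration name hard to get out of sync; the actual shares and queue sizes would need tuning against the cluster's own load.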
Policies
APF on control plane
Service create/update/delete/list operations against kube-apiserver are subject to API Priority and Fairness; high-frequency reconciliation should use watches and informers rather than polling.
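Watches and informers reduce how many requests a client sends, but a client can still receive HTTP 429 when the API server sheds load. A minimal sketch of client-side handling follows, assuming a hypothetical send callable that returns (status_code, headers, body); it honors a numeric Retry-After header when one is supplied and otherwise falls back to exponential backoff with full jitter. The function name, parameters, and defaults are illustrative and do not belong to any Kubernetes client library.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status_code, headers, body).
    Returns (status_code, body) for the first non-429 response.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            # The server told us how long to wait; cap it at max_delay.
            delay = min(max_delay, float(retry_after))
        else:
            # Exponential backoff with full jitter.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        sleep(delay)
    raise RuntimeError(f"still throttled (HTTP 429) after {max_attempts} attempts")
```

Passing sleep as a parameter keeps the function easy to exercise without real delays. Production Kubernetes clients generally ship their own retry and rate-limiting logic, so a hand-rolled loop like this mainly applies to scripts that talk to the API over raw HTTP.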
kube-proxy / iptables / IPVS scaling
ClusterIP and NodePort throughput depends on the kube-proxy mode (iptables, IPVS, or nftables) and on node kernel networking; Kubernetes does not document a numeric limit for it.
External LB quotas
Type=LoadBalancer Services provision a cloud-provider load balancer, so per-LB and per-account quotas apply, for example AWS NLB connection limits or Google Cloud forwarding-rule limits. AWS ALB rule limits apply when an ALB fronts the Service through an Ingress.
Endpoint slice fan-out
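Very large Services (thousands of endpoints) should rely on EndpointSlice (the default in modern Kubernetes) rather than the legacy Endpoints object. A change to one endpoint then updates only the affected slice instead of rewriting and redistributing one very large object, which reduces kube-apiserver load.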
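The fan-out argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes the default of 100 endpoints per slice (the kube-controller-manager --max-endpoints-per-slice setting) and perfectly packed slices, which the real controller does not guarantee, so the slice count is a lower bound. The function names are hypothetical, and the comparison ignores the truncation that newer Kubernetes versions apply to oversized Endpoints objects.

```python
import math

# Default for kube-controller-manager --max-endpoints-per-slice.
DEFAULT_MAX_ENDPOINTS_PER_SLICE = 100


def min_slices(endpoint_count, per_slice=DEFAULT_MAX_ENDPOINTS_PER_SLICE):
    """Lower bound on the number of EndpointSlices needed for a Service."""
    if endpoint_count < 0 or per_slice < 1:
        raise ValueError("endpoint_count must be >= 0 and per_slice >= 1")
    return math.ceil(endpoint_count / per_slice)


def entries_resent_per_change(endpoint_count, per_slice=DEFAULT_MAX_ENDPOINTS_PER_SLICE):
    """Rough number of endpoint entries re-sent to each watcher when one endpoint changes."""
    return {
        # One monolithic object: every entry is rewritten and redistributed.
        "legacy_endpoints": endpoint_count,
        # Only the slice holding the changed endpoint is rewritten.
        "endpoint_slice": min(endpoint_count, per_slice),
    }


if __name__ == "__main__":
    print(min_slices(5000))                 # 50
    print(entries_resent_per_change(5000))  # {'legacy_endpoints': 5000, 'endpoint_slice': 100}
```

Under these assumptions a 5,000-endpoint Service re-sends about 100 entries per change with EndpointSlice, versus all 5,000 with a single Endpoints object, and that cost is multiplied by every watcher, including kube-proxy on each node.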