TensorFlow · Rate Limits

TensorFlow is an open-source library, not a hosted API service; there is no central TensorFlow rate limiter. Inference throughput is bounded only by the hardware and serving stack (TensorFlow Serving, TF Lite runtime, custom inference servers) the user deploys.

Limits

Self-hosted inference deployment: varies; bounded by the self-hosted hardware and serving stack.
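When TensorFlow Serving is the serving stack, throughput can be tuned (not capped) through its request-batching options. A minimal sketch of a batching-parameters text proto, assuming TensorFlow Serving is launched with `--enable_batching` and `--batching_parameters_file`; the values are illustrative, not recommendations:

```
max_batch_size { value: 32 }
batch_timeout_micros { value: 1000 }
num_batch_threads { value: 4 }
max_enqueued_batches { value: 100 }
```

Larger batches and more batch threads trade per-request latency for aggregate throughput; the right values depend on the model and hardware.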

Policies

Self-Hosted Throughput
Throughput is a function of the user's deployment (CPU/GPU/TPU, batch size, model size, serving stack); TensorFlow itself does not throttle.
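Because TensorFlow applies no throttling of its own, any rate limit has to be enforced around the inference call, either in a gateway or in the client. A minimal sketch of a client-side token-bucket limiter; `predict` here is a hypothetical stand-in for a real model call (e.g. a `model.predict` invocation or a request to a TensorFlow Serving endpoint), not a TensorFlow API:

```python
import time


class TokenBucket:
    """Token-bucket limiter: allows `rate` calls per second on average,
    with bursts of up to `capacity` calls."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens in proportion to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def predict(x):
    # Hypothetical stand-in for the actual inference call.
    return x * 2


bucket = TokenBucket(rate=5.0, capacity=2)


def rate_limited_predict(x):
    """Wrap the inference call; callers decide whether to retry or queue."""
    if not bucket.allow():
        raise RuntimeError("rate limit exceeded; retry later")
    return predict(x)
```

In production this logic usually lives in a reverse proxy or API gateway in front of the serving stack rather than in application code, but the mechanism is the same.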
