TensorFlow · Rate Limits

TensorFlow is an open-source library, not a hosted API service; there is no central TensorFlow rate limiter. Inference throughput is bounded only by the hardware and serving stack (TensorFlow Serving, TF Lite runtime, custom inference servers) the user deploys.

Limits

Self-hosted inference deployment: varies; bounded by the self-hosted hardware and serving stack.
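When TensorFlow Serving is the serving stack, throughput can be tuned (not capped) through its request-batching options. A minimal sketch of a batching-parameters text proto, assuming TensorFlow Serving is launched with `--enable_batching` and `--batching_parameters_file`; the values are illustrative, not recommendations:

```
max_batch_size { value: 32 }
batch_timeout_micros { value: 1000 }
num_batch_threads { value: 4 }
max_enqueued_batches { value: 100 }
```

Larger batches and more batch threads trade per-request latency for aggregate throughput; the right values depend on the model and hardware.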

Policies

Self-Hosted Throughput
Throughput is a function of the user's deployment (CPU/GPU/TPU, batch size, model size, serving stack); TensorFlow itself does not throttle.
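Because TensorFlow applies no throttling of its own, any rate limit has to be enforced around the inference call, either in a gateway or in the client. A minimal sketch of a client-side token-bucket limiter; `predict` here is a hypothetical stand-in for a real model call (e.g. a `model.predict` invocation or a request to a TensorFlow Serving endpoint), not a TensorFlow API:

```python
import time


class TokenBucket:
    """Token-bucket limiter: allows `rate` calls per second on average,
    with bursts of up to `capacity` calls."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens in proportion to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def predict(x):
    # Hypothetical stand-in for the actual inference call.
    return x * 2


bucket = TokenBucket(rate=5.0, capacity=2)


def rate_limited_predict(x):
    """Wrap the inference call; callers decide whether to retry or queue."""
    if not bucket.allow():
        raise RuntimeError("rate limit exceeded; retry later")
    return predict(x)
```

In production this logic usually lives in a reverse proxy or API gateway in front of the serving stack rather than in application code, but the mechanism is the same.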
