Chroma Rate Limits

Chroma Cloud is serverless and meters Write / Storage / Query / Network usage rather than capping requests per second. Public documentation does not publish a numeric per-second or per-minute API rate-limit policy; tenants that exceed their configured spend or per-database scaling envelope are paused rather than throttled. The open-source Chroma server (self-hosted) has no built-in rate limiter; its throughput is bounded by the underlying host.

Limits: 2 · Throttle status: 429
Tags: AI, Vector Database, Retrieval, Serverless, Rate Limiting

Limits

Cloud usage envelope
Scope: tenant
Limit: varies (usage-bounded; Write / Storage / Query / Network are metered, not capped per second)
Service pauses if monthly usage exceeds plan limits (Starter / Team) or the contracted ceiling (Enterprise).
Self-hosted server
Scope: deployment
Limit: varies (no built-in rate limit; bounded by host CPU / RAM and any front-proxy throttling)
Open-source Chroma does not enforce per-client rate limits; operators can add a reverse-proxy throttle if needed.
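
For self-hosted deployments, the kind of per-client throttle an operator might place in front of the server can be sketched as a token bucket. This is only an illustration of the technique, not part of Chroma: the class names, the `rate` and `capacity` parameters, and the idea of keying buckets by a client identifier are all assumptions, and a production setup would more likely use the rate-limiting feature of the reverse proxy itself.

```python
import time


class TokenBucket:
    """Hypothetical token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.updated = clock()

    def allow(self):
        """Consume one token if available; return whether the request may pass."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keeps one bucket per client id (for example an API key or source IP)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, client_id):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[client_id] = bucket
        return bucket.allow()
```

A proxy layer using this sketch would answer with 429 whenever `allow()` returns `False` and forward the request to the Chroma server otherwise.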

Policies

Backoff Strategy
On 429 / 5xx, retry with exponential backoff and jitter; honor Retry-After when present.
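
A minimal sketch of this retry policy, using only the Python standard library. The `send` callable, its `(status, headers, body)` return shape, and the `base`, `cap`, and `max_attempts` defaults are hypothetical and chosen for illustration; `headers` is assumed to be a plain dict keyed by `"Retry-After"`. The jitter variant shown is "full jitter" (a uniform draw up to the exponential ceiling), which is one common choice rather than anything Chroma prescribes.

```python
import random
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return seconds to wait from a Retry-After value, or None if unusable.

    Handles both forms of the header: delta-seconds and an HTTP-date.
    """
    if value is None:
        return None
    value = value.strip()
    if value.isdecimal():
        return float(int(value))
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full-jitter delay: uniform draw below min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical callable returning (status, headers, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts - 1:
            return status, body
        # Prefer the server's hint; fall back to exponential backoff with jitter.
        wait = parse_retry_after(headers.get("Retry-After"))
        if wait is None:
            wait = backoff_delay(attempt)
        sleep(wait)
```

Wrapping each request in `call_with_retry` keeps the retry logic in one place; the last response is returned unchanged once attempts are exhausted, so the caller decides how to surface the failure.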
Batch Writes
Use batched add / upsert calls rather than per-vector writes to keep the Write meter and request count efficient.
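
Batching can be sketched as a small helper that groups records into fixed-size chunks and issues one write per chunk. The helper names and the `batch_size` default of 100 are hypothetical, and the `collection` argument is assumed to expose an `upsert(ids=..., documents=...)` method in the style of the Chroma Python client's collection object; check the client documentation for the exact arguments and any maximum batch size before relying on it.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def upsert_in_batches(collection, records, batch_size=100):
    """Upsert (id, document) pairs in chunks; return the number of requests made.

    `collection` is assumed to provide upsert(ids=..., documents=...).
    """
    calls = 0
    for chunk in batched(records, batch_size):
        collection.upsert(
            ids=[record_id for record_id, _ in chunk],
            documents=[document for _, document in chunk],
        )
        calls += 1
    return calls
```

With 250 records and the default batch size, this issues three write requests instead of 250.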
Pause on Quota
Cloud tenants are paused (not throttled) when plan usage limits are exceeded; upgrade the plan or raise the Enterprise envelope to resume.
