Open WebUI · Rate Limits
Open WebUI does not impose project-level API rate limits. Effective limits are determined by (1) the upstream LLM backend's limits (Ollama concurrency, or OpenAI/Anthropic RPM caps) and (2) any reverse-proxy or admin throttling configured in the deployment. Standard HTTP semantics apply.
When throttling is configured (at a proxy or upstream), rejected requests return HTTP 429 (Too Many Requests).
Limits

| Scope | Limit | Notes |
| --- | --- | --- |
| Project-level | n/a (no built-in cap) | Open WebUI itself does not throttle; configure limits at a reverse proxy if needed. |
| Upstream LLM backend | External, backend-defined | Ollama concurrency settings or OpenAI/Anthropic RPM caps apply. |
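Because Open WebUI passes through whatever status the proxy or upstream backend returns, clients should treat HTTP 429 as retryable. A minimal standard-library sketch; the URL, port, and endpoint path here are assumptions for a typical local deployment, not fixed by Open WebUI itself.

```python
import time
import urllib.request
import urllib.error


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Seconds to wait before retry `attempt` (0-based).

    Prefers the server's Retry-After header value when present;
    otherwise falls back to capped exponential backoff.
    """
    if retry_after is not None:
        return min(float(retry_after), cap)
    return min(base * (2 ** attempt), cap)


def post_with_backoff(url, data, headers, max_retries=5):
    """POST to an endpoint behind a rate limiter, retrying on HTTP 429."""
    for attempt in range(max_retries):
        req = urllib.request.Request(url, data=data, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            # Only 429 is retryable here; re-raise anything else or the last attempt.
            if e.code != 429 or attempt == max_retries - 1:
                raise
            time.sleep(backoff_delay(attempt, e.headers.get("Retry-After")))
```

The `Retry-After` handling matters with reverse-proxy throttling, since Nginx and friends can be configured to emit that header alongside the 429.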
Policies
Reverse-Proxy Throttling
Use Nginx/Caddy/Traefik in front of Open WebUI to enforce per-IP or per-user limits if needed.
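As an illustration, a per-IP limit in Nginx might look like the sketch below. The rate, burst, and upstream port 3000 (Open WebUI's common default) are assumptions to tune for your deployment; note that `limit_req_zone` must live in the `http` context of the full config.

```nginx
# Sketch: per-IP request limiting in front of Open WebUI.
# 10 req/s per client IP with a burst of 20; excess requests get HTTP 429.
limit_req_zone $binary_remote_addr zone=owui:10m rate=10r/s;
limit_req_status 429;

server {
    listen 80;

    location / {
        limit_req zone=owui burst=20 nodelay;
        proxy_pass http://127.0.0.1:3000;   # assumed Open WebUI address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```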
Backend Concurrency
Tune Ollama's OLLAMA_NUM_PARALLEL to control how many requests each model serves concurrently, or list multiple endpoints in OPENAI_API_BASE_URLS to spread load across backends.
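For example, a deployment's environment might set these variables as below. The values are illustrative assumptions; the second URL is a hypothetical backup gateway, and the right numbers depend on your hardware and traffic.

```shell
# Ollama: concurrent requests served per loaded model (illustrative value).
export OLLAMA_NUM_PARALLEL=4
# Ollama: how many requests may queue before rejection (illustrative value).
export OLLAMA_MAX_QUEUE=256
# Open WebUI: multiple OpenAI-compatible endpoints, semicolon-separated,
# to spread load (second URL is hypothetical).
export OPENAI_API_BASE_URLS="https://api.openai.com/v1;http://backup-gateway:8080/v1"
```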