Open WebUI · Rate Limits
Open WebUI does not impose project-level API rate limits. Effective limits are determined by (1) the upstream LLM backend's limits (Ollama concurrency, or OpenAI/Anthropic RPM caps) and (2) any reverse-proxy or admin throttling configured in the deployment. Standard HTTP semantics apply.
When throttling is configured (at a proxy or upstream), rejected requests return HTTP 429 (Too Many Requests).
Limits

| Scope | Limit | Notes |
| --- | --- | --- |
| Project-level | n/a (no built-in cap) | Open WebUI itself does not throttle; configure limits at a reverse proxy if needed. |
| Upstream LLM backend | External, backend-defined | Ollama concurrency settings or OpenAI/Anthropic RPM caps apply. |
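Because Open WebUI passes through whatever status the proxy or upstream backend returns, clients should treat HTTP 429 as retryable. A minimal standard-library sketch; the URL, port, and endpoint path here are assumptions for a typical local deployment, not fixed by Open WebUI itself.

```python
import time
import urllib.request
import urllib.error


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Seconds to wait before retry `attempt` (0-based).

    Prefers the server's Retry-After header value when present;
    otherwise falls back to capped exponential backoff.
    """
    if retry_after is not None:
        return min(float(retry_after), cap)
    return min(base * (2 ** attempt), cap)


def post_with_backoff(url, data, headers, max_retries=5):
    """POST to an endpoint behind a rate limiter, retrying on HTTP 429."""
    for attempt in range(max_retries):
        req = urllib.request.Request(url, data=data, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            # Only 429 is retryable here; re-raise anything else or the last attempt.
            if e.code != 429 or attempt == max_retries - 1:
                raise
            time.sleep(backoff_delay(attempt, e.headers.get("Retry-After")))
```

The `Retry-After` handling matters with reverse-proxy throttling, since Nginx and friends can be configured to emit that header alongside the 429.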
Policies
Reverse-Proxy Throttling
Use Nginx/Caddy/Traefik in front of Open WebUI to enforce per-IP or per-user limits if needed.
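As an illustration, a per-IP limit in Nginx might look like the sketch below. The rate, burst, and upstream port 3000 (Open WebUI's common default) are assumptions to tune for your deployment; note that `limit_req_zone` must live in the `http` context of the full config.

```nginx
# Sketch: per-IP request limiting in front of Open WebUI.
# 10 req/s per client IP with a burst of 20; excess requests get HTTP 429.
limit_req_zone $binary_remote_addr zone=owui:10m rate=10r/s;
limit_req_status 429;

server {
    listen 80;

    location / {
        limit_req zone=owui burst=20 nodelay;
        proxy_pass http://127.0.0.1:3000;   # assumed Open WebUI address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```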
Backend Concurrency
Tune Ollama's OLLAMA_NUM_PARALLEL to control how many requests each model serves concurrently, or list multiple endpoints in OPENAI_API_BASE_URLS to spread load across backends.
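For example, a deployment's environment might set these variables as below. The values are illustrative assumptions; the second URL is a hypothetical backup gateway, and the right numbers depend on your hardware and traffic.

```shell
# Ollama: concurrent requests served per loaded model (illustrative value).
export OLLAMA_NUM_PARALLEL=4
# Ollama: how many requests may queue before rejection (illustrative value).
export OLLAMA_MAX_QUEUE=256
# Open WebUI: multiple OpenAI-compatible endpoints, semicolon-separated,
# to spread load (second URL is hypothetical).
export OPENAI_API_BASE_URLS="https://api.openai.com/v1;http://backup-gateway:8080/v1"
```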