Qubrid AI · Rate Limits

Qubrid AI Rate Limits

Qubrid AI does not publish numeric per-second or per-minute API rate limits on its public pricing or product pages. The serverless inference endpoint is OpenAI-compatible and is exposed through the Qubrid platform; per-account throttling and quotas are managed in the console once an account is funded. Confirm concrete TPS/RPM ceilings in the Qubrid API key dashboard or with the sales team.

2 limits · Throttle response: HTTP 429

Tags: Artificial Intelligence, Inference, GPU, Rate Limiting

Limits

Inference Rate Limit

Scope: API key
Limit: varies; see the Qubrid platform console

Per-API-key throttling is not published on the public pricing page; check the platform dashboard.

GPU Provisioning Quota

Scope: account
Limit: varies; see the Qubrid platform console

GPU VM and bare-metal availability is provisioned on request; capacity caps depend on the contract term and inventory.

Policies

Backoff Strategy

Clients hitting 429 responses on the OpenAI-compatible inference endpoint should apply exponential backoff with jitter.
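
The backoff policy above can be sketched roughly as follows. This is a minimal illustration, not Qubrid's client code: the RateLimited exception, the send callable, and the base, cap, and max_retries values are all hypothetical and are not published by Qubrid. It uses "full jitter", where each delay is drawn uniformly between zero and an exponentially growing ceiling.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception a client would raise on an HTTP 429 response."""


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Full-jitter delay: rng() scaled by min(cap, base * 2**attempt)."""
    # base and cap are illustrative defaults, not Qubrid-documented values.
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send, max_retries=5, base=1.0, cap=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call send(); on RateLimited, wait with jittered backoff and retry."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted: surface the 429 to the caller
            sleep(backoff_delay(attempt, base, cap, rng))
```

In practice, send would wrap whatever HTTP or SDK call the application makes and raise RateLimited when the response status is 429. The sleep and rng parameters are injectable only so the retry schedule can be tested without real waiting.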

Quota Source

Numeric request and token quotas are set per-API-key in the Qubrid platform console; enterprise customers can negotiate higher limits via sales.
