Use Open WebUI with GammaInfra

Open WebUI is a great self-hosted ChatGPT-style UI. Configure it once with a GammaInfra connection and your local UI can talk to every major LLM through one API key, with cost visible per request and automatic provider fallback.

What changes with GammaInfra

One connection covers every model. Instead of configuring separate connections to OpenAI, Anthropic, Google, etc., add GammaInfra once and the entire catalog (gpt-5, claude-opus-4-7, gemini-3.1-pro, deepseek-v4-pro, llama-3.3-70b, etc.) appears in Open WebUI's model dropdown.
Cost per request. Open WebUI doesn't show cost natively, but GammaInfra's dashboard rolls up exact per-request spend.
Fallback when one provider throttles. Your local UI doesn't break when OpenAI rate-limits — GammaInfra cascades transparently.

Setup

1. Get a GammaInfra API key

2. Open Open WebUI admin settings

Sign in to your Open WebUI instance as an admin
Click your profile avatar → Admin Panel
Settings → Connections

3. Add an OpenAI API connection

Under "OpenAI API", click the + button to add a new connection:

API Base URL: https://api.gammainfra.com/v1
API Key:      sk-gammainfra-...

Click the refresh/test button next to the connection. Open WebUI will fetch the model list via GET /v1/models — if the test succeeds, GammaInfra's models populate the model dropdown.

4. Optional: name the connection

Newer Open WebUI versions allow naming the connection (e.g. "GammaInfra"). This appears as a prefix in the model dropdown so you can distinguish GammaInfra-routed models from any other direct connections you keep.

5. Optional: filter the model list

GammaInfra's /v1/models response returns the full catalog (40+ models). To keep the dropdown clean, use Open WebUI's Model Filtering setting to show only the models you actually use — for example: gammainfra/auto, anthropic/claude-opus-4-7, openai/gpt-5-mini, deepseek/deepseek-v4-pro.

Multi-user instances: If you run Open WebUI for a team, all users share the single GammaInfra API key by default. For per-user attribution, give each user their own GammaInfra key (one signup each, $3 trial credit each) and have them configure their own connection — or use Open WebUI's per-user API key feature to override the admin connection.

Recommended models to expose

For a general self-hosted chat UI, this is a reasonable starting set:

gammainfra/auto — default chat. Task-aware routing picks the best fit per prompt.
anthropic/claude-opus-4-7 — for users who want quality reasoning.
openai/gpt-5 — OpenAI flagship.
google/gemini-3.1-pro-preview — strong multimodal.
deepseek/deepseek-v4-pro — cheap reasoning with thinking mode.
gammainfra/cheap — for high-volume use cases.

Function calling, vision, streaming — work as-is

Open WebUI's tool-call and vision features rely on the OpenAI protocol. GammaInfra passes these through to the underlying provider unchanged. Vision works on models that support it (gpt-5, claude-opus-4-7, gemini-3.1-pro); attempting vision on a text-only model returns the underlying provider's error.

Trade-offs

Latency. ~10–50 ms overhead per request. Imperceptible for chat-UI use.
Cost. 3% top-up fee (launch) / 5% standard. Pass-through provider rates on tokens — no markup. BYOK 1–2% per request if you want to use your own provider keys.
Privacy. Prompts and responses aren't logged by default. Privacy policy. Your Open WebUI instance still stores chat history locally per its own config.

Troubleshoot

Model list is empty. The test request failed. Check the Base URL has /v1 at the end (not just https://gammainfra.com) and the API key is valid.
402 Payment required. Trial credit exhausted. Top up in the dashboard.
Streaming hangs. Open WebUI's SSE handling occasionally trips on certain providers. Try a direct-pin model (openai/gpt-5-mini) to confirm GammaInfra is fine; if a specific model misbehaves, report it via Discord.

Ready to try it?

Get a GammaInfra API key →

$3 free trial credit on signup, $10 minimum top-up. Pass-through provider token rates plus 3% top-up fee during the launch window.

Frequently asked questions

How do I add GammaInfra as a connection in Open WebUI?

In Open WebUI admin Settings → Connections → OpenAI API, click + Add Connection. URL: https://api.gammainfra.com/v1. Key: your sk-gammainfra-... token. Test the connection. Once it returns green, the models endpoint populates the model picker with every model GammaInfra exposes.

Does Open WebUI's model picker show gammainfra/auto?

Yes. GammaInfra's /v1/models returns gammainfra/auto, gammainfra/fast, gammainfra/cheap, plus every directly pin-able provider model. They all appear in Open WebUI's picker once the connection is added. Pin gammainfra/auto as the default for cost-aware smart routing.

Can users in my Open WebUI instance see their per-message cost?

Open WebUI doesn't currently parse X-GammaInfra-Cost-USD headers into its UI. An admin can monitor total cost via the GammaInfra dashboard. For per-user attribution, issue separate API keys per user (or per Open WebUI group) and filter the dashboard by key.

What if my self-hosted Open WebUI hits the gateway's rate limit?

The default 240 rpm per-API-key cap applies. For a heavily-used Open WebUI instance, either (a) issue multiple API keys and round-robin them, (b) ask GammaInfra support to raise the cap on your key (we provision higher caps for steady-state volume), or (c) configure Open WebUI to BYOK your provider keys directly to spread load across provider accounts.

Does GammaInfra work with Open WebUI's RAG features?

Yes. Open WebUI's RAG runs the retrieval step locally (vector search over your uploaded documents), then sends the augmented prompt as a regular chat completion. GammaInfra sees the augmented prompt and routes normally. For RAG workloads with long retrieved context, watch the X-GammaInfra-Cost-USD header carefully — long-context surcharges on some models can change the economics.