Use Continue with GammaInfra
Continue lets you wire up different LLMs for different editor roles — chat, autocomplete, edit, apply. Done well, that's powerful. Done by hand, it's a JSON file with 4–7 nearly-identical entries. Point Continue at GammaInfra and one entry covers every provider, with task-aware routing or per-role pins.
The pain Continue users hit
- One entry per model, per role. A typical Continue config.json has separate model entries for chat, autocomplete, and edit. Each gets a provider, key, model name — and if you're using OpenAI for chat + Anthropic for autocomplete + a local model for apply, you maintain three sets of credentials.
- Switching models means re-editing JSON. Try a new model, fight the config, restart the extension.
- No unified cost picture. Continue doesn't show cost in the UI, so cost tracking lives elsewhere.
What changes with GammaInfra
Point Continue's openai provider at GammaInfra's smart router — one entry gives you the whole catalog through the OpenAI SDK shape Continue already speaks. Direct-pin specific models when you want; let gammainfra/auto decide when you don't.
Setup
1. Get a GammaInfra API key
Sign up at gammainfra.com and verify your email. $3 trial credit covers thousands of autocomplete completions or a few hundred Chat-panel responses.
2. Open Continue's config
Click the gear icon in the Continue panel (VS Code or JetBrains), or open ~/.continue/config.json directly. The file is JSON with models, tabAutocompleteModel, and other top-level keys.
3. Add a GammaInfra model entry
The minimal addition:
{
"models": [
{
"title": "GammaInfra Auto",
"provider": "openai",
"model": "gammainfra/auto",
"apiBase": "https://api.gammainfra.com/v1",
"apiKey": "sk-gammainfra-..."
}
]
}
Restart the Continue panel (Cmd/Ctrl + Shift + P → "Continue: Reload"). The "GammaInfra Auto" entry now appears in the model picker.
4. Configure per-role models (optional)
For sharper control, give each editor role its own model — still all going through GammaInfra so you keep the single API key:
{
"models": [
{
"title": "GammaInfra Auto",
"provider": "openai",
"model": "gammainfra/auto",
"apiBase": "https://api.gammainfra.com/v1",
"apiKey": "sk-gammainfra-..."
},
{
"title": "Claude Opus",
"provider": "openai",
"model": "anthropic/claude-opus-4-7",
"apiBase": "https://api.gammainfra.com/v1",
"apiKey": "sk-gammainfra-..."
}
],
"tabAutocompleteModel": {
"title": "Fast autocomplete",
"provider": "openai",
"model": "gammainfra/fast",
"apiBase": "https://api.gammainfra.com/v1",
"apiKey": "sk-gammainfra-..."
}
}
Now: Tab autocomplete uses the latency-optimized hedged path. Chat panel can switch between "GammaInfra Auto" (smart routing) and "Claude Opus" (forced quality tier).
"apply" and "edit" roles benefit from precise diff-aware models. Pin anthropic/claude-sonnet-4-6 or openai/gpt-5 for these via explicit role-model entries (see Continue's docs on roleModels) — same GammaInfra endpoint, different model name.
Model name choices
gammainfra/auto— task-aware routing. Reasonable default for the main Chat panel.gammainfra/fast— latency-optimized. Good fortabAutocompleteModel.gammainfra/cheap— cost-optimized. Good for high-volume scripts running through Continue's API.anthropic/claude-opus-4-7,openai/gpt-5,google/gemini-3.1-pro-preview— direct provider pins for quality tier.deepseek/deepseek-v4-pro— cheaper reasoning tier with built-in thinking mode.
Tips
- Use multiple entries to A/B compare. Add "GammaInfra + Opus" and "GammaInfra + GPT-5" as two entries — switch in the model picker without restarting Continue.
- Cost reporting. Continue doesn't show cost natively. The GammaInfra dashboard at dashboard.gammainfra.com shows per-request cost rolled up by model and date.
- Free-tier Continue users: If you're using Continue's free tier with API keys (instead of Continue Pro), GammaInfra's pass-through model means you only pay actual provider rates plus the top-up fee — no per-completion markup.
Trade-offs
- Latency. ~10–50 ms overhead per request. For autocomplete this is borderline noticeable; consider
gammainfra/fast(hedged) which often beats going direct. - Cost. Pass-through token rates + 3% top-up fee (launch window, 5% standard). BYOK at 1–2% per request if you bring your own provider keys.
- Privacy. Prompts and responses aren't logged by default. Privacy policy.
Ready to try it?
$3 free trial credit on signup, $10 minimum top-up. Pass-through provider token rates plus 3% top-up fee during the launch window.
Frequently asked questions
How do I point Continue at GammaInfra?
openai-type provider with base URL https://api.gammainfra.com/v1, your sk-gammainfra-... key, and a model such as gammainfra/auto. Continue's config file and field names have changed across versions (older JSON config, newer YAML) — follow the setup steps above and Continue's current docs for the version you run. The GammaInfra-side values (base URL, key, model names) are identical regardless of Continue version.Can I use a different model for Continue's autocomplete vs chat?
gammainfra/fast (latency-optimized) and chat at gammainfra/auto or a pinned model like anthropic/claude-opus-4-7. The exact field names for the two slots differ by Continue version — see Continue's current docs; on the GammaInfra side each slot is just a model name, nothing special is required.Does GammaInfra preserve Continue's MCP integration?
tools[] field of each request and forwards them to the resolved provider. Tool calling works end-to-end across the providers GammaInfra routes to (OpenAI, Anthropic, Google, Mistral, Groq, DeepSeek).Can I see per-completion cost in the Continue panel?
What happens during Continue's long-running edits if a provider times out?
X-GammaInfra-Max-Latency-Ms in the request headers (via Continue's custom-headers config) to cap the total time the gateway will spend cascading — useful if you want hard latency guarantees on edit flows.