Use Continue with GammaInfra

Continue lets you wire up different LLMs for different editor roles — chat, autocomplete, edit, apply. Done well, that's powerful. Done by hand, it's a JSON file with 4–7 nearly-identical entries. Point Continue at GammaInfra and one entry covers every provider, with task-aware routing or per-role pins.

The pain Continue users hit

One entry per model, per role. A typical Continue config.json has separate model entries for chat, autocomplete, and edit. Each gets a provider, key, model name — and if you're using OpenAI for chat + Anthropic for autocomplete + a local model for apply, you maintain three sets of credentials.
Switching models means re-editing JSON. Try a new model, fight the config, restart the extension.
No unified cost picture. Continue doesn't show cost in the UI, so cost tracking lives elsewhere.

What changes with GammaInfra

Point Continue's openai provider at GammaInfra's smart router — one entry gives you the whole catalog through the OpenAI SDK shape Continue already speaks. Direct-pin specific models when you want; let gammainfra/auto decide when you don't.

Setup

1. Get a GammaInfra API key

Sign up at gammainfra.com and verify your email. $3 trial credit covers thousands of autocomplete completions or a few hundred Chat-panel responses.

2. Open Continue's config

Click the gear icon in the Continue panel (VS Code or JetBrains), or open ~/.continue/config.json directly. The file is JSON with models, tabAutocompleteModel, and other top-level keys.

3. Add a GammaInfra model entry

The minimal addition:

{
  "models": [
    {
      "title": "GammaInfra Auto",
      "provider": "openai",
      "model": "gammainfra/auto",
      "apiBase": "https://api.gammainfra.com/v1",
      "apiKey": "sk-gammainfra-..."
    }
  ]
}

Restart the Continue panel (Cmd/Ctrl + Shift + P → "Continue: Reload"). The "GammaInfra Auto" entry now appears in the model picker.

4. Configure per-role models (optional)

For sharper control, give each editor role its own model — still all going through GammaInfra so you keep the single API key:

{
  "models": [
    {
      "title": "GammaInfra Auto",
      "provider": "openai",
      "model": "gammainfra/auto",
      "apiBase": "https://api.gammainfra.com/v1",
      "apiKey": "sk-gammainfra-..."
    },
    {
      "title": "Claude Opus",
      "provider": "openai",
      "model": "anthropic/claude-opus-4-7",
      "apiBase": "https://api.gammainfra.com/v1",
      "apiKey": "sk-gammainfra-..."
    }
  ],
  "tabAutocompleteModel": {
    "title": "Fast autocomplete",
    "provider": "openai",
    "model": "gammainfra/fast",
    "apiBase": "https://api.gammainfra.com/v1",
    "apiKey": "sk-gammainfra-..."
  }
}

Now: Tab autocomplete uses the latency-optimized hedged path. Chat panel can switch between "GammaInfra Auto" (smart routing) and "Claude Opus" (forced quality tier).

Apply / edit roles: Continue's "apply" and "edit" roles benefit from precise diff-aware models. Pin anthropic/claude-sonnet-4-6 or openai/gpt-5 for these via explicit role-model entries (see Continue's docs on roleModels) — same GammaInfra endpoint, different model name.

Model name choices

gammainfra/auto — task-aware routing. Reasonable default for the main Chat panel.
gammainfra/fast — latency-optimized. Good for tabAutocompleteModel.
gammainfra/cheap — cost-optimized. Good for high-volume scripts running through Continue's API.
anthropic/claude-opus-4-7, openai/gpt-5, google/gemini-3.1-pro-preview — direct provider pins for quality tier.
deepseek/deepseek-v4-pro — cheaper reasoning tier with built-in thinking mode.

Tips

Use multiple entries to A/B compare. Add "GammaInfra + Opus" and "GammaInfra + GPT-5" as two entries — switch in the model picker without restarting Continue.
Cost reporting. Continue doesn't show cost natively. The GammaInfra dashboard at dashboard.gammainfra.com shows per-request cost rolled up by model and date.
Free-tier Continue users: If you're using Continue's free tier with API keys (instead of Continue Pro), GammaInfra's pass-through model means you only pay actual provider rates plus the top-up fee — no per-completion markup.

Trade-offs

Latency. ~10–50 ms overhead per request. For autocomplete this is borderline noticeable; consider gammainfra/fast (hedged) which often beats going direct.
Cost. Pass-through token rates + 3% top-up fee (launch window, 5% standard). BYOK at 1–2% per request if you bring your own provider keys.
Privacy. Prompts and responses aren't logged by default. Privacy policy.

Ready to try it?

Get a GammaInfra API key →

$3 free trial credit on signup, $10 minimum top-up. Pass-through provider token rates plus 3% top-up fee during the launch window.

Frequently asked questions

How do I point Continue at GammaInfra?

Continue uses an OpenAI-compatible provider entry: an openai-type provider with base URL https://api.gammainfra.com/v1, your sk-gammainfra-... key, and a model such as gammainfra/auto. Continue's config file and field names have changed across versions (older JSON config, newer YAML) — follow the setup steps above and Continue's current docs for the version you run. The GammaInfra-side values (base URL, key, model names) are identical regardless of Continue version.

Can I use a different model for Continue's autocomplete vs chat?

Yes. Continue lets you set the inline-autocomplete model separately from the chat model(s). Point autocomplete at gammainfra/fast (latency-optimized) and chat at gammainfra/auto or a pinned model like anthropic/claude-opus-4-7. The exact field names for the two slots differ by Continue version — see Continue's current docs; on the GammaInfra side each slot is just a model name, nothing special is required.

Does GammaInfra preserve Continue's MCP integration?

MCP tool registration in Continue runs locally and is independent of the model provider. GammaInfra simply receives the resulting tool definitions in the tools[] field of each request and forwards them to the resolved provider. Tool calling works end-to-end across the providers GammaInfra routes to (OpenAI, Anthropic, Google, Mistral, Groq, DeepSeek).

Can I see per-completion cost in the Continue panel?

Continue doesn't currently surface response headers. Use the GammaInfra dashboard for per-request cost. The dashboard's filtering by API key lets you isolate Continue activity if you use a key dedicated to that editor.

What happens during Continue's long-running edits if a provider times out?

The router cascades to the next provider in the fallback chain (typically a different vendor) and Continue receives one successful response. Set X-GammaInfra-Max-Latency-Ms in the request headers (via Continue's custom-headers config) to cap the total time the gateway will spend cascading — useful if you want hard latency guarantees on edit flows.