Use Cursor with GammaInfra
Point Cursor at GammaInfra for per-completion cost visibility, automatic provider fallback when one provider rate-limits, and per-mode model pinning. Setup takes about two minutes; no extension, no plugin.
The pain
Cursor's BYOK path uses your provider API key directly. That's fine until you hit either of two walls:
- No per-completion cost visibility. A heavy autocomplete session can burn through tokens fast and you only see the damage on the provider's monthly invoice.
- Provider throttling kills sessions. When OpenAI or Anthropic rate-limits your Cursor activity, Cursor stops dead. Manual mitigation is "switch your config to a different provider," which means setting it up again.
What changes with GammaInfra
Point Cursor at GammaInfra's smart router (https://api.gammainfra.com/v1) instead of https://api.openai.com/v1 — same OpenAI SDK shape, three things start happening:
- Every completion response carries
X-GammaInfra-Cost-USD,X-GammaInfra-Input-Cost-USD, andX-GammaInfra-Output-Cost-USDheaders. Sum across a session and you know exactly what that 200-completion refactor cost. - The fallback chain absorbs provider rate-limits. When OpenAI throttles your key, GammaInfra cascades to the next provider in the chain — typically Anthropic, then Google, then Mistral — transparently. Cursor sees a 200, your refactor finishes.
- You can mix models per Cursor mode.
gammainfra/autofor inline autocomplete (task-aware routing picks the cheap fast model),anthropic/claude-opus-4-7for the Chat panel when you want quality. One key, many endpoints.
Setup
1. Get a GammaInfra API key
Sign up at gammainfra.com and verify your email. New accounts get $3 of trial credit after verification — enough for a few hundred Cursor completions to confirm the flow works before you top up. Copy the API key from the dashboard.
2. Open Cursor's model settings
In Cursor:
- Cmd/Ctrl + , (or open Settings from the menu)
- Navigate to Models → OpenAI API Key
- Toggle on Override OpenAI Base URL
3. Set base URL and key
Two fields:
OpenAI Base URL: https://api.gammainfra.com/v1
OpenAI API Key: sk-gammainfra-... (paste your GammaInfra key here)
Click Verify. Cursor will make a test request to confirm the endpoint responds.
4. Pick a model
In the same settings panel, set the model name. Some choices:
gammainfra/auto— task-aware routing. Cursor's autocomplete prompts route to fast cheap models; longer Chat-panel prompts route to higher-quality models. This is the default we recommend for most users.gammainfra/fast— latency-optimized. Hedged requests when enabled, p50-latency-aware picker. Best for inline completion.gammainfra/cheap— cost-optimized. Cheapest viable model for the prompt.openai/gpt-5-mini— pin OpenAI's small model directly.anthropic/claude-opus-4-7— pin Claude Opus.anthropic/claude-sonnet-4-6— pin Claude Sonnet, balanced quality + speed.
gammainfra/fast and Chat to anthropic/claude-opus-4-7 — you get cheap fast completions while typing, quality reasoning when you ask a hard question.
Verify it's working
Trigger any completion in Cursor (start typing a function, ask the Chat panel a question). Then check your GammaInfra dashboard at dashboard.gammainfra.com:
- Your request appears in the request log with the resolved provider and model.
- Cost is itemized per request.
- If you check Cursor's network panel (DevTools), you'll see
X-GammaInfra-Cost-USDon the response headers.
Trade-offs to know about
- Latency. Routing through GammaInfra adds 10–50 ms of overhead vs hitting OpenAI directly. For Cursor's autocomplete (TTFB-sensitive), this is usually imperceptible. The hedged-request feature on
gammainfra/fastcan actually reduce p50 latency vs going direct, since it races two providers in parallel. - Cost. GammaInfra charges a top-up fee (3% during the launch window, 5% standard) rather than marking up tokens. For a Cursor user spending $50/month on completions, that's $1.50–$2.50/month on top — the cost-visibility and fallback features generally pay for themselves the first time a provider throttles your session.
- Privacy. GammaInfra doesn't log prompts or responses by default. See the privacy policy for the full picture.
Bring your own provider keys (BYOK)
If you have existing relationships with OpenAI/Anthropic/etc. and want to keep using your own provider keys directly, you can add them to GammaInfra via the dashboard's Provider Keys tab. GammaInfra will route through your keys when present (BYOK mode) and only charge a 1–2% per-request fee on the retail cost. Use this when you have provider credits to burn or when you want direct provider billing.
Troubleshoot
- 401 Unauthorized. The API key is wrong or hasn't been activated. Sign in to the dashboard and re-copy it.
- 402 Payment required. Your trial credit is exhausted. Top up the dashboard or wait for next month's free tier.
- 429 Rate limit exceeded. You hit the default 240-requests-per-minute rate limit. Slow down or contact us — heavy Cursor users get higher limits on request.
- 503 Service unavailable. All providers in the fallback chain are down. Rare; usually self-resolves within seconds.
Detailed error codes are in the docs. Stuck? Open a ticket in Discord — we usually reply within an hour.
Ready to try it?
$3 free trial credit on signup, $10 minimum top-up. Pass-through provider token rates plus 3% top-up fee during the launch window (5% after 2026-06-23).
Frequently asked questions
Does using GammaInfra with Cursor add noticeable latency?
gammainfra/fast as the model, hedged requests can actually reduce p95 latency vs going direct because the gateway races two providers in parallel and takes the first success.Can I see per-completion cost in Cursor?
X-GammaInfra-Cost-USD on completion responses directly.Can I configure different models for Cursor's inline autocomplete vs the Chat panel?
gammainfra/fast (latency-optimized) and the Chat panel to anthropic/claude-opus-4-7 (quality flagship). You get cheap fast completions while typing and quality reasoning when you ask a hard question.What happens if my preferred provider rate-limits while I'm coding?
X-GammaInfra-Fallback-Chain on the response.