Models
Every major LLM through GammaInfra's smart router. Pricing is pass-through — provider rates with a 3% top-up fee during the launch window (5% standard), no per-token markup. Click a model for the full spec sheet.
Routing options:
- gammainfra/auto: task-aware routing picks the best-fit model per prompt
- gammainfra/fast: latency-optimized with hedged requests
- gammainfra/cheap: cost-optimized routing
- Or pin any model below directly by its provider/slug
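The distinction between router aliases and pinned models can be sketched as a small client-side check. This is an illustration only: parse_model and its return shape are hypothetical helpers, not part of any GammaInfra SDK, and the assumption that gammainfra/ accepts only the three aliases listed here comes from this page alone.

```python
ROUTER_PREFIX = "gammainfra"
# The three router aliases named in the routing options list.
ROUTING_MODES = {"auto", "fast", "cheap"}


def parse_model(model: str) -> tuple[str, str, str]:
    """Split a model string into (kind, provider, name).

    kind is "router" for gammainfra/* aliases and "pinned" for any
    other provider/slug. Hypothetical helper for illustration.
    """
    # Split on the first "/" only, so slugs containing ":" or "." stay intact.
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/slug', got {model!r}")
    if provider == ROUTER_PREFIX:
        if name not in ROUTING_MODES:
            raise ValueError(f"unknown routing mode {name!r}")
        return ("router", provider, name)
    return ("pinned", provider, name)


print(parse_model("gammainfra/auto"))
print(parse_model("bedrock/us.amazon.nova-pro-v1:0"))
```

A check like this catches a malformed model string before a request is sent; whether the service itself rejects unknown aliases is not stated on this page.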
OpenAI
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| openai/gpt-5 | $1.25 | $10.00 | 256K | ✓ | ✓ | ✓ |
| openai/gpt-5-mini | $0.25 | $2.00 | 256K | ✓ | ✓ | ✓ |
| openai/gpt-5-nano | $0.05 | $0.40 | 256K | ✓ | — | ✓ |
| openai/gpt-5.5 | $5.00 | $30.00 | 272K | ✓ | ✓ | ✓ |
Anthropic
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| anthropic/claude-opus-4-7 | $5.00 | $25.00 | 200K | ✓ | ✓ | ✓ |
| anthropic/claude-opus-4-6 | $5.00 | $25.00 | 200K | ✓ | ✓ | ✓ |
| anthropic/claude-sonnet-4-6 | $3.00 | $15.00 | 200K | ✓ | ✓ | ✓ |
| anthropic/claude-haiku-4-5 | $0.80 | $4.00 | 200K | ✓ | ✓ | ✓ |
Google
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| google/gemini-3.1-pro-preview | $2.50 | $15.00 | 2M | ✓ | ✓ | ✓ |
| google/gemini-3-flash-preview | $0.30 | $2.50 | 1M | ✓ | ✓ | ✓ |
| google/gemini-3.1-flash-lite-preview | $0.07 | $0.30 | 1M | ✓ | ✓ | ✓ |
DeepSeek
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| deepseek/deepseek-v4-pro | $0.50 | $2.00 | 128K | ✓ | — | ✓ |
| deepseek/deepseek-v4-flash | $0.14 | $0.55 | 128K | ✓ | — | ✓ |
Mistral
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| mistral/mistral-large-2512 | $2.00 | $6.00 | 128K | ✓ | — | ✓ |
| mistral/mistral-small-2603 | $0.20 | $0.60 | 128K | ✓ | — | ✓ |
| mistral/devstral-2512 | $0.40 | $1.60 | 128K | ✓ | — | ✓ |
Groq (Llama)
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| groq/llama-3.3-70b-versatile | $0.59 | $0.79 | 128K | ✓ | — | ✓ |
| groq/llama-3.1-8b-instant | $0.05 | $0.08 | 128K | ✓ | — | ✓ |
xAI (Grok)
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| grok/grok-4-1-fast-non-reasoning | $0.30 | $1.20 | 256K | ✓ | — | ✓ |
Amazon Bedrock
| Slug | Input / 1M | Output / 1M | Context | Tools | Vision | JSON |
|---|---|---|---|---|---|---|
| bedrock/us.anthropic.claude-opus-4-7 | $5.00 | $25.00 | 200K | ✓ | ✓ | ✓ |
| bedrock/us.amazon.nova-pro-v1:0 | $0.80 | $3.20 | 300K | ✓ | ✓ | ✓ |
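As a rough guide to reading the tables, the per-1M rates translate into a request cost as sketched below. The function names and the token counts are illustrative assumptions; the rates are the openai/gpt-5-mini row above. The page does not say whether the top-up fee is deducted from the top-up or charged on top of it, so the sketch only computes the fee amount.

```python
from decimal import Decimal

# Table rates are quoted in USD per 1M tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def token_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Pass-through token cost in USD for one request (no per-token markup)."""
    input_cost = Decimal(input_tokens) * Decimal(input_rate) / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * Decimal(output_rate) / TOKENS_PER_UNIT
    return input_cost + output_cost


def top_up_fee(amount_usd, fee_percent):
    """Fee amount in USD for a top-up at the given percentage."""
    return Decimal(amount_usd) * Decimal(fee_percent) / Decimal(100)


# Hypothetical request: 12,000 input tokens and 800 output tokens
# on openai/gpt-5-mini ($0.25 input, $2.00 output per 1M tokens).
example_cost = token_cost(12_000, 800, "0.25", "2.00")

# Fee on a hypothetical $100 top-up at the 3% launch rate.
example_fee = top_up_fee("100", "3")

print(example_cost, example_fee)
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift. Treat the result as an estimate only.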
Frequently asked questions
Which models can I call through GammaInfra?
Models from every major provider: OpenAI, Anthropic, Google, Mistral, Groq, DeepSeek, xAI, and Amazon Bedrock. Browse the catalog for per-model pricing and capabilities. Pin any of them directly with a provider/model name, or use gammainfra/auto and let the router pick per prompt.
How is model pricing shown?
Each model page lists the provider pass-through input and output rate per 1M tokens. GammaInfra adds 0% markup on tokens; its fee is taken at top-up (3% launch / 5% standard) or 1–2% per request on BYOK. Every response also returns the exact spend in X-GammaInfra-Cost-USD.
Do I have to pick a model?
No. gammainfra/auto classifies each prompt by task and routes it to the best-fit model automatically, with cross-provider fallback. Pin a specific model only when the choice is part of your application's design rather than a routing variable.
How current is the model catalog?
The catalog tracks the providers' shipping models; new models are added as provider adapters are wired in. Pricing and capability fields on each model page are maintained against the providers' published rates, but the X-GammaInfra-Cost-USD response header is always authoritative for actual spend.
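Reading the authoritative spend from a response might look like the sketch below. It assumes the header value is a plain decimal string in USD, which this page does not specify; spend_from_headers is a hypothetical helper that works on any mapping of response headers rather than on a particular HTTP client.

```python
from decimal import Decimal, InvalidOperation
from typing import Mapping, Optional

COST_HEADER = "X-GammaInfra-Cost-USD"


def spend_from_headers(headers: Mapping[str, str]) -> Optional[Decimal]:
    """Return the reported spend in USD, or None if absent or unparseable.

    Assumes the header carries a plain decimal string (an assumption,
    not confirmed by the catalog page).
    """
    wanted = COST_HEADER.lower()
    # HTTP header names are case-insensitive, so compare in lowercase.
    for key, value in headers.items():
        if key.lower() == wanted:
            try:
                return Decimal(value.strip())
            except InvalidOperation:
                return None
    return None


# Hypothetical response headers for illustration.
print(spend_from_headers({"x-gammainfra-cost-usd": "0.0046"}))
```

Summing these per-response values gives actual spend, which can then be compared with an estimate built from the catalog rates.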