GPT-5 mini
openai/gpt-5-mini
OpenAI's small workhorse. Cheap enough to head the chat fallback chain on GammaInfra. Heads most cost-sensitive dispatcher chains.
Pricing
| Direction | USD per 1M tokens | USD per 1K tokens |
|---|---|---|
| Input | $0.2500 | $0.000250 |
| Output | $2.0000 | $0.002000 |
Pass-through provider rates via GammaInfra. No per-token markup. A 3% top-up fee (launch window through 2026-06-23, then 5%) applies on managed credits; the BYOK alternative is 1–2% per request.
Capabilities
Specifications
| Field | Value |
|---|---|
| Context window | 256K |
| Max output | 128K |
| Provider | openai |
| Streaming | Yes — OpenAI-compatible SSE |
Best for
Task labels reflect where this model heads or appears in GammaInfra's default dispatcher chains. Override per-request with the X-GammaInfra-Preference or X-GammaInfra-Cost-Quality header.
How to call it
Through GammaInfra's smart router with one of your GammaInfra API keys:
curl https://api.gammainfra.com/v1/chat/completions \
-H "Authorization: Bearer sk-gammainfra-..." \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-5-mini",
"messages": [
{"role": "user", "content": "Hello, GPT-5 mini!"}
]
}'
Or via any OpenAI SDK — see the integrations page for setup with Cursor, Cline, LangChain, the Vercel AI SDK, and others.
Smart routing — or pin this model
You can call openai/gpt-5-mini directly (as above), or let GammaInfra's router pick the best-fit model per prompt. Use gammainfra/auto as the model name for task-aware routing, gammainfra/fast for latency-optimized hedged requests, or gammainfra/cheap for cost-optimized routing. The router considers task type, latency, and your X-GammaInfra-Cost-Quality dial when picking.
Related models
Ready to try it?
$3 free trial credit on signup, $10 minimum top-up. Pass-through provider token rates plus 3% top-up fee during the launch window (5% after 2026-06-23).
Frequently asked questions
How much does GPT-5 mini cost through GammaInfra?
openai/gpt-5-mini) is billed at the OpenAI pass-through rate — $0.25 per 1M input tokens and $2.0 per 1M output tokens, with 0% token markup. GammaInfra's fee is taken at top-up time (3% during the launch window through 2026-06-23, 5% after), not per token; the BYOK option is 1–2% per request instead. Every response returns X-GammaInfra-Cost-USD with the exact spend for that call.What is GPT-5 mini's context window?
Does GPT-5 mini support tool calling, vision, and JSON mode?
How do I call GPT-5 mini through GammaInfra?
https://api.gammainfra.com/v1 with your sk-gammainfra-... key, then set the model to openai/gpt-5-mini to pin GPT-5 mini directly — or use gammainfra/auto to let the smart router pick it when it is the best fit. Only base_url and api_key change; the rest of your OpenAI SDK code is unchanged.When does GammaInfra's router pick GPT-5 mini?
gammainfra/auto the router selects it when a prompt classifies into one of those task types and your cost/quality preference fits; pin openai/gpt-5-mini to force it regardless of routing. The X-GammaInfra-Endpoint response header always reports which model actually served the request.