Use Vercel AI SDK with GammaInfra
The AI SDK's @ai-sdk/openai provider accepts a custom baseURL. Point it at GammaInfra and your existing generateText / streamText / generateObject calls reach every major LLM through one provider configuration — with per-request cost in response headers and automatic fallback when one provider rate-limits.
What changes with GammaInfra
- One provider for everything. Instead of importing
@ai-sdk/openai,@ai-sdk/anthropic,@ai-sdk/google, etc., import@ai-sdk/openaionce and point it at GammaInfra. The full model catalog is reachable through model-name strings. - Edge-runtime ready. GammaInfra speaks plain HTTPS over the OpenAI SDK shape, so it works in Vercel Edge Functions, Cloudflare Workers, Bun, Deno, and anywhere else the AI SDK runs.
- Cost visibility. Per-request cost is in the response header
X-GammaInfra-Cost-USD. The dashboard rolls it up by model and day.
Setup
1. Get a GammaInfra API key
Sign up at gammainfra.com and verify your email.
2. Install packages
npm install ai @ai-sdk/openai
# or pnpm / bun / yarn
3. Configure createOpenAI with the GammaInfra baseURL
import { createOpenAI } from '@ai-sdk/openai';
const kraken = createOpenAI({
baseURL: 'https://api.gammainfra.com/v1',
apiKey: process.env.KRAKEN_API_KEY, // sk-gammainfra-...
});
4. Use it like any other AI SDK provider
Generate text:
import { generateText } from 'ai';
const { text, usage } = await generateText({
model: kraken('gammainfra/auto'),
prompt: 'Explain hedged requests in three sentences.',
});
console.log(text);
console.log('Tokens:', usage);
Stream:
import { streamText } from 'ai';
const result = streamText({
model: kraken('gammainfra/fast'),
prompt: 'Write a haiku about hedged requests.',
});
for await (const chunk of result.textStream) {
process.stdout.write(chunk);
}
Structured output:
import { generateObject } from 'ai';
import { z } from 'zod';
const { object } = await generateObject({
model: kraken('anthropic/claude-opus-4-7'),
schema: z.object({
city: z.string(),
temperature_f: z.number(),
conditions: z.string(),
}),
prompt: 'Generate a weather report for Tokyo.',
});
console.log(object.temperature_f);
Tool calling:
import { generateText, tool } from 'ai';
import { z } from 'zod';
const { text, toolCalls } = await generateText({
model: kraken('anthropic/claude-opus-4-7'),
tools: {
getWeather: tool({
description: 'Get the current weather in a city.',
parameters: z.object({ city: z.string() }),
execute: async ({ city }) => `72°F sunny in ${city}`,
}),
},
prompt: 'What is the weather in Tokyo?',
});
toolu_* ↔ call_* tool-call IDs at the boundary, so the AI SDK's tool-loop semantics work whether the underlying model is OpenAI or Anthropic. No special handling needed.
Recommended models for AI SDK workloads
gammainfra/auto— default. Task-aware routing.gammainfra/fast— latency-optimized with hedged requests. Good for streaming UX.anthropic/claude-opus-4-7— quality reasoning.generateObjectwith complex schemas.openai/gpt-5,openai/gpt-5-mini— OpenAI direct pins.deepseek/deepseek-v4-pro— cheap reasoning option with thinking mode.- Full list at
GET /v1/models.
Bonus features only GammaInfra adds
Pass custom headers via the AI SDK's headers option:
const { text } = await generateText({
model: kraken('gammainfra/auto'),
prompt: '...',
headers: {
'X-GammaInfra-Cost-Quality': '0.3', // 0=quality, 1=cost
'X-GammaInfra-Max-Latency-Ms': '20000', // 504 on overrun
},
});
Edge runtime example (Next.js App Router)
// app/api/chat/route.ts
import { createOpenAI } from '@ai-sdk/openai';
import { streamText } from 'ai';
const kraken = createOpenAI({
baseURL: 'https://api.gammainfra.com/v1',
apiKey: process.env.KRAKEN_API_KEY,
});
export const runtime = 'edge';
export async function POST(req: Request) {
const { messages } = await req.json();
const result = streamText({
model: kraken('gammainfra/auto'),
messages,
});
return result.toAIStreamResponse();
}
Reading the cost header
The AI SDK's high-level functions don't expose response headers directly. To read X-GammaInfra-Cost-USD per call, hook into the OpenAI provider's lower-level layer:
import { createOpenAI } from '@ai-sdk/openai';
const kraken = createOpenAI({
baseURL: 'https://api.gammainfra.com/v1',
apiKey: process.env.KRAKEN_API_KEY,
fetch: async (input, init) => {
const res = await fetch(input, init);
console.log('GammaInfra cost:', res.headers.get('X-GammaInfra-Cost-USD'));
console.log('Endpoint:', res.headers.get('X-GammaInfra-Endpoint'));
return res;
},
});
For aggregate cost across many calls, the dashboard at dashboard.gammainfra.com shows per-request itemization.
Trade-offs
- Latency. ~10–50 ms overhead vs going direct. Can be net-negative for
gammainfra/fastwith hedging. - Cost. 3% top-up fee (launch) / 5% standard on managed credits. Pass-through provider rates on tokens — no markup. BYOK at 1–2% per request alternative.
- Privacy. Prompts and responses aren't logged by default. Privacy policy.
Ready to try it?
$3 free trial credit on signup, $10 minimum top-up. Pass-through provider token rates plus 3% top-up fee during the launch window.
Frequently asked questions
How do I configure @ai-sdk/openai for GammaInfra?
createOpenAI from @ai-sdk/openai with baseURL: 'https://api.gammainfra.com/v1' and apiKey: process.env.GAMMAINFRA_API_KEY. Then use it like any AI SDK provider: const gammainfra = createOpenAI({ baseURL, apiKey }); then gammainfra('gammainfra/auto') in your streamText or generateText calls.Does streamText() work with GammaInfra's streaming?
streamText receives chunks in the same format it expects from native OpenAI. All AI SDK features — tool calls, text streaming, structured output via generateObject / streamObject — work end-to-end.Can I access GammaInfra's cost headers in my Vercel app?
x-gammainfra-cost-usd plus the input/output split) on every response. The AI SDK surfaces upstream response headers on the result of generateText / streamText, and via an onFinish callback for streaming — consult the AI SDK's current docs for the exact accessor in your SDK version, then read the x-gammainfra-cost-usd header from it. The header names on the GammaInfra side are stable across SDK versions.Does GammaInfra support the AI SDK's tool calling pattern?
tool() helper from @ai-sdk/openai, pass them to streamText or generateText, and the AI SDK serializes them into the request's tools[] field. GammaInfra forwards to the resolved provider and translates tool_call.id formats across providers so the AI SDK's tool-result round-trip works regardless of which vendor served the call.What about generative UI with useObject and Server Components?
streamObject and the React hooks layer (useObject) treat the underlying provider as opaque. Once you've created the GammaInfra provider via createOpenAI, all higher-level abstractions — RSC streaming, generative UI, useChat — work unchanged. The router runs server-side per request, fully transparent to the client component.