Use Vercel AI SDK with GammaInfra

The AI SDK's @ai-sdk/openai provider accepts a custom baseURL. Point it at GammaInfra and your existing generateText / streamText / generateObject calls reach every major LLM through one provider configuration — with per-request cost in response headers and automatic fallback when one provider rate-limits.

What changes with GammaInfra

One provider for everything. Instead of importing @ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google, etc., import @ai-sdk/openai once and point it at GammaInfra. The full model catalog is reachable through model-name strings.
Edge-runtime ready. GammaInfra speaks plain HTTPS over the OpenAI SDK shape, so it works in Vercel Edge Functions, Cloudflare Workers, Bun, Deno, and anywhere else the AI SDK runs.
Cost visibility. Per-request cost is in the response header X-GammaInfra-Cost-USD. The dashboard rolls it up by model and day.

Setup

1. Get a GammaInfra API key

2. Install packages

npm install ai @ai-sdk/openai
# or pnpm / bun / yarn

3. Configure createOpenAI with the GammaInfra baseURL

import { createOpenAI } from '@ai-sdk/openai';

const kraken = createOpenAI({
  baseURL: 'https://api.gammainfra.com/v1',
  apiKey: process.env.KRAKEN_API_KEY,   // sk-gammainfra-...
});

4. Use it like any other AI SDK provider

Generate text:

import { generateText } from 'ai';

const { text, usage } = await generateText({
  model: kraken('gammainfra/auto'),
  prompt: 'Explain hedged requests in three sentences.',
});

console.log(text);
console.log('Tokens:', usage);

Stream:

import { streamText } from 'ai';

const result = streamText({
  model: kraken('gammainfra/fast'),
  prompt: 'Write a haiku about hedged requests.',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Structured output:

import { generateObject } from 'ai';
import { z } from 'zod';

const { object } = await generateObject({
  model: kraken('anthropic/claude-opus-4-7'),
  schema: z.object({
    city: z.string(),
    temperature_f: z.number(),
    conditions: z.string(),
  }),
  prompt: 'Generate a weather report for Tokyo.',
});

console.log(object.temperature_f);

Tool calling:

import { generateText, tool } from 'ai';
import { z } from 'zod';

const { text, toolCalls } = await generateText({
  model: kraken('anthropic/claude-opus-4-7'),
  tools: {
    getWeather: tool({
      description: 'Get the current weather in a city.',
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => `72°F sunny in ${city}`,
    }),
  },
  prompt: 'What is the weather in Tokyo?',
});

Tool-call IDs across providers: GammaInfra translates toolu_* ↔ call_* tool-call IDs at the boundary, so the AI SDK's tool-loop semantics work whether the underlying model is OpenAI or Anthropic. No special handling needed.

Recommended models for AI SDK workloads

gammainfra/auto — default. Task-aware routing.
gammainfra/fast — latency-optimized with hedged requests. Good for streaming UX.
anthropic/claude-opus-4-7 — quality reasoning. generateObject with complex schemas.
openai/gpt-5, openai/gpt-5-mini — OpenAI direct pins.
deepseek/deepseek-v4-pro — cheap reasoning option with thinking mode.
Full list at GET /v1/models.

Bonus features only GammaInfra adds

Pass custom headers via the AI SDK's headers option:

const { text } = await generateText({
  model: kraken('gammainfra/auto'),
  prompt: '...',
  headers: {
    'X-GammaInfra-Cost-Quality': '0.3',          // 0=quality, 1=cost
    'X-GammaInfra-Max-Latency-Ms': '20000',      // 504 on overrun
  },
});

Edge runtime example (Next.js App Router)

// app/api/chat/route.ts
import { createOpenAI } from '@ai-sdk/openai';
import { streamText } from 'ai';

const kraken = createOpenAI({
  baseURL: 'https://api.gammainfra.com/v1',
  apiKey: process.env.KRAKEN_API_KEY,
});

export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: kraken('gammainfra/auto'),
    messages,
  });

  return result.toAIStreamResponse();
}

Reading the cost header

The AI SDK's high-level functions don't expose response headers directly. To read X-GammaInfra-Cost-USD per call, hook into the OpenAI provider's lower-level layer:

import { createOpenAI } from '@ai-sdk/openai';

const kraken = createOpenAI({
  baseURL: 'https://api.gammainfra.com/v1',
  apiKey: process.env.KRAKEN_API_KEY,
  fetch: async (input, init) => {
    const res = await fetch(input, init);
    console.log('GammaInfra cost:', res.headers.get('X-GammaInfra-Cost-USD'));
    console.log('Endpoint:',    res.headers.get('X-GammaInfra-Endpoint'));
    return res;
  },
});

For aggregate cost across many calls, the dashboard at dashboard.gammainfra.com shows per-request itemization.

Trade-offs

Latency. ~10–50 ms overhead vs going direct. Can be net-negative for gammainfra/fast with hedging.
Cost. 3% top-up fee (launch) / 5% standard on managed credits. Pass-through provider rates on tokens — no markup. BYOK at 1–2% per request alternative.
Privacy. Prompts and responses aren't logged by default. Privacy policy.

Ready to try it?

Get a GammaInfra API key →

$3 free trial credit on signup, $10 minimum top-up. Pass-through provider token rates plus 3% top-up fee during the launch window.

Frequently asked questions

How do I configure @ai-sdk/openai for GammaInfra?

Use createOpenAI from @ai-sdk/openai with baseURL: 'https://api.gammainfra.com/v1' and apiKey: process.env.GAMMAINFRA_API_KEY. Then use it like any AI SDK provider: const gammainfra = createOpenAI({ baseURL, apiKey }); then gammainfra('gammainfra/auto') in your streamText or generateText calls.

Does streamText() work with GammaInfra's streaming?

Yes. GammaInfra forwards the upstream SSE stream verbatim (after wire-format normalization across providers) so streamText receives chunks in the same format it expects from native OpenAI. All AI SDK features — tool calls, text streaming, structured output via generateObject / streamObject — work end-to-end.

Can I access GammaInfra's cost headers in my Vercel app?

Yes. GammaInfra returns the cost headers (x-gammainfra-cost-usd plus the input/output split) on every response. The AI SDK surfaces upstream response headers on the result of generateText / streamText, and via an onFinish callback for streaming — consult the AI SDK's current docs for the exact accessor in your SDK version, then read the x-gammainfra-cost-usd header from it. The header names on the GammaInfra side are stable across SDK versions.

Does GammaInfra support the AI SDK's tool calling pattern?

Yes. Define tools via the tool() helper from @ai-sdk/openai, pass them to streamText or generateText, and the AI SDK serializes them into the request's tools[] field. GammaInfra forwards to the resolved provider and translates tool_call.id formats across providers so the AI SDK's tool-result round-trip works regardless of which vendor served the call.

What about generative UI with useObject and Server Components?

streamObject and the React hooks layer (useObject) treat the underlying provider as opaque. Once you've created the GammaInfra provider via createOpenAI, all higher-level abstractions — RSC streaming, generative UI, useChat — work unchanged. The router runs server-side per request, fully transparent to the client component.