Blog

Q: How do I try the routing behavior described in these posts?

Point any OpenAI-compatible SDK at https://api.gammainfra.com/v1 with a GammaInfra key and use model gammainfra/auto. Every behavior the posts describe — per-step routing, the cost-quality dial, the cost-USD header, fallback chains — is live on that endpoint with no extra configuration.

Posts on how we build GammaInfra — smart routing design, the bug stories from running it in production, and the methodology behind shipping a solo-founder LLM routing service.

2026-05-14

Every agent step deserves a different model

Agent loops compound every weakness of using one model for everything. A field guide to per-step model variance, tail-latency budgets, fallback chains, cost-runaway observability, and the cross-provider tool_call.id papercut — with code samples for Claude Agent SDK, LangGraph, and OpenAI Agents SDK.

2026-05-09

Why every GammaInfra response carries a cost-USD header

LLM API providers report tokens. Developers care about dollars. How we compute X-GammaInfra-Cost-USD per request, why it's harder than it sounds (per-direction split, long-context surcharges, fallback cascades), and how to read it from common SDKs.

2026-04-23

Designing a continuous cost/quality dial for LLM routing

Most LLM routers force a discrete tier — cheap, balanced, quality. We added a continuous 0.0..1.0 dial via one request header. Why continuous beats discrete, what we tried and threw away, how it maps to actual model picks.

2026-05-25 (planned)

Show HN retro — what we learned launching GammaInfraAfter May 14

A founder's retro on launching GammaInfra's smart routing service to Show HN — front-page traffic numbers, comment-thread themes, the bugs HN-day traffic surfaced, and what's next.

Frequently asked questions

What does the GammaInfra blog cover?

Engineering write-ups on smart LLM routing: per-step model selection in agent loops, the continuous cost/quality dial design, why every response carries a cost-USD header, and the trade-offs behind the routing decisions. It is the deeper-dive layer above the API docs and glossary.

Is GammaInfra's routing approach documented elsewhere?

Yes — the API reference is at docs.gammainfra.com and conceptual definitions are in the glossary. The blog is where design rationale and measured results live; the docs are the reference contract.

How do I try the routing behavior described in these posts?

Point any OpenAI-compatible SDK at https://api.gammainfra.com/v1 with a GammaInfra key and use model gammainfra/auto. Every behavior the posts describe — per-step routing, the cost-quality dial, the cost-USD header, fallback chains — is live on that endpoint with no extra configuration.

How often is the blog updated?

Posts ship alongside notable routing or observability work rather than on a fixed calendar. Each post is dated and the engineering claims are reproducible against the live API at publication time.