Question 1

What does an LLM router actually decide?

Accepted Answer

For each incoming request, the router picks (a) which provider to dispatch to, (b) which specific model on that provider, (c) the order of fallback candidates if the first choice fails. Some routers also decide whether to hedge — fire two providers in parallel and take the first success.

Question 2

Rule-based vs learned routing — which is better?

Accepted Answer

Rule-based routing is predictable, debuggable, and works on day one with no training data. Learned routing requires accumulated quality signals to train on but can capture subtle prompt-to-model fit that rules miss. Production routers usually combine both — rules for high-confidence shortcuts, a learned classifier for the rest.

Question 3

Can callers override the router?

Accepted Answer

Yes, in two ways. (1) Pin a specific model in the request: model=anthropic/claude-opus-4-7 bypasses smart routing entirely. (2) Use preference hints: X-GammaInfra-Preference: quality biases toward stronger models, X-GammaInfra-Preference: cost biases toward cheaper ones, X-GammaInfra-Cost-Quality: 0.3 is a continuous dial.

Question 4

How does the router avoid getting stuck on a slow provider?

Accepted Answer

Two mechanisms. First, a max-latency budget per request via X-GammaInfra-Max-Latency-Ms — the upstream call is cancelled and the request 504s if the budget is exceeded. Second, live p50 latency monitoring (refreshed every 30 seconds, 5-minute window) updates the router's preference ordering so a chronically slow provider drops in priority automatically.

Question 5

What inputs does a learned LLM router use?

Accepted Answer

Typically: a prompt embedding (e.g. MiniLM, BGE-base), the presence of attachments (images, audio), the tools/functions array, the requested response_format, the caller's preference signal, and live operational signals (provider health, live p50 latency, cost). The classifier maps these to a logical-label distribution which then resolves through a per-label endpoint registry.

What is an LLM router?

Why routing exists

Routing strategies in practice

Rule-based routing

Learned (ML-based) routing

Caller-driven routing

Hybrid (most production routers)

How GammaInfra's router works

Common questions

Try the router