June 12, 2026 · 3 min read

How smart model routing cuts AI costs

Why sending every message to a premium model is wasteful — and how Telsi routes across 25 models instead.

Ask a frontier model "what's 15% of 80?" and you'll get the right answer — delivered by one of the most expensive computational artifacts ever built. It works, but it's like hiring a senior partner to staple documents. Multiply that by every message a chatty assistant handles in a month, and the economics get silly.

Telsi takes a different approach: smart model routing, powered by Solvela.

The core idea

Not all messages are equal. A quick factual lookup, a casual reply, a date calculation — small, fast models handle these perfectly well at a fraction of the cost. Multi-step reasoning, subtle writing, tricky code — that's where premium models earn their price.

So instead of pinning your assistant to one model, Telsi classifies every message you send first: how hard is this, really? Simple requests route to cheap, fast models. Complex ones route to premium models. You always get an answer from a model that's good enough for the job, and you stop paying frontier prices for stapler work.
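To make the classify-then-route idea concrete, here is a minimal sketch. It is not Telsi's or Solvela's actual classifier, which this post doesn't describe: the cue words, the scoring thresholds, and the two tier names are all assumptions chosen for illustration.

```python
# Hypothetical cue words and tiers; a real router would likely use a trained classifier.
COMPLEX_CUES = ("step by step", "refactor", "debug", "prove", "essay", "architecture")
CODE_FENCE = "`" * 3  # three backticks, the usual marker for pasted code


def estimate_difficulty(message: str) -> int:
    """Return a rough difficulty score from surface features of the message."""
    text = message.lower()
    score = 0
    if len(text.split()) > 40:   # long prompts tend to need more reasoning
        score += 1
    if CODE_FENCE in message:    # pasted code usually means a harder task
        score += 2
    score += sum(1 for cue in COMPLEX_CUES if cue in text)
    return score


def route(message: str) -> str:
    """Map a message to an illustrative tier name: 'cheap' or 'premium'."""
    return "premium" if estimate_difficulty(message) >= 2 else "cheap"
```

With these assumed rules, `route("what's 15% of 80?")` lands on the cheap tier, while a message that pastes code and asks for a step-by-step debug lands on premium.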

Across a realistic mix of everyday messages, routing cuts model spend by up to about 80% compared with sending everything to a single premium model.
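A back-of-envelope calculation shows how a figure in that range can arise. The inputs below are assumptions picked for illustration, not measured Telsi numbers: 85% of messages count as simple, and a cheap model costs 5% of what a premium model does per message. The sketch also ignores the cost of the classification step itself.

```python
def blended_savings(simple_share: float, cheap_cost_ratio: float) -> float:
    """Fraction of spend saved versus sending every message to the premium model.

    simple_share: fraction of messages routed to the cheap tier (assumed).
    cheap_cost_ratio: cheap-model cost per message relative to premium (assumed).
    """
    # Premium cost per message is normalized to 1.0.
    routed_cost = simple_share * cheap_cost_ratio + (1 - simple_share) * 1.0
    return 1.0 - routed_cost


# Illustrative inputs only.
print(f"{blended_savings(0.85, 0.05):.0%}")  # prints 81%
```

The savings are driven mostly by the share of simple messages: with the same price ratio but only half the traffic being simple, the same formula gives a bit under 50%.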

25 models, one assistant

Your Telsi assistant routes across 25 models spanning multiple providers — from lightweight workhorses to top-tier reasoning models. That breadth matters for more than cost:

  • Fit: a model that's great at code isn't always the best at warm conversational tone. Routing picks per-message, not per-account.
  • Resilience: one provider having a bad day doesn't take your assistant down with it.
  • Progress without migration: when better or cheaper models ship, they join the pool — your assistant improves without you touching anything.
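The resilience point can be sketched as a simple fallback loop: try the preferred model for a message, and move to the next candidate if its provider fails. Everything here is hypothetical, including the `ProviderError` type and the shape of a model call; the post doesn't say how the real routing layer handles outages.

```python
from __future__ import annotations

from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a model call when its provider is unavailable (hypothetical)."""


ModelCall = Callable[[str], str]


def answer_with_fallback(
    message: str, candidates: Sequence[tuple[str, ModelCall]]
) -> tuple[str, str]:
    """Try candidates in preference order; return (model_name, reply) from the first success."""
    last_error: Exception | None = None
    for name, call in candidates:
        try:
            return name, call(message)
        except ProviderError as err:
            last_error = err  # this provider is down; try the next candidate
    raise RuntimeError("all candidate models failed") from last_error
```

Because the candidate list is just data, adding a newly released model to the pool means appending an entry rather than changing the assistant.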

Paying per request, not per platform

Under the hood there's a second unusual choice: how the model calls are paid for. Each request from your instance is settled as a per-request micropayment on Solana — the routing layer pays each model provider for exactly the compute your message used, request by request, in USDC.
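As a rough illustration of what settling one request could involve, the sketch below prices a single call from its token usage and rounds the amount to USDC's six decimal places. Token-based pricing, the rates, and the `Settlement` record are assumptions for the example, and the on-chain transfer itself is left out entirely.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Settlement:
    provider: str
    amount_usdc: Decimal


def settle_request(
    provider: str,
    input_tokens: int,
    output_tokens: int,
    usd_per_million_in: Decimal,
    usd_per_million_out: Decimal,
) -> Settlement:
    """Price one request from its token usage and return the amount owed to the provider."""
    million = Decimal(1_000_000)
    cost = (input_tokens * usd_per_million_in + output_tokens * usd_per_million_out) / million
    # USDC uses 6 decimal places, so round to the nearest micro-dollar.
    return Settlement(provider, cost.quantize(Decimal("0.000001")))
```

With hypothetical rates of $0.50 and $1.50 per million input and output tokens, a request using 1,000 input and 500 output tokens settles at 0.001250 USDC. `Decimal` is used instead of floats so that many tiny amounts add up without rounding drift.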

Why should you care about plumbing? Because it removes the usual middle layer of bulk API contracts and pre-committed spend that platforms typically smear across their customers as margin. Costs stay proportional to actual usage — which is exactly what makes a generous allowance like 20,000 messages a month sustainable at $29.
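The two numbers in that sentence imply a per-message budget, which is easy to check. The calculation below uses only the plan price and allowance quoted above; treating the whole $29 as available for model spend is a simplifying assumption, so the result is a ceiling rather than an actual cost.

```python
from decimal import Decimal

plan_price_usd = Decimal("29")       # monthly price quoted in the post
monthly_messages = Decimal("20000")  # monthly allowance quoted in the post

# Upper bound on average spend per message if the full allowance is used.
budget_per_message = plan_price_usd / monthly_messages
print(budget_per_message)  # prints 0.00145
```

At roughly a seventh of a cent per message, the average request has to be served cheaply, which is why routing most traffic to small models matters.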

What this means in practice

You don't see any of this machinery. You just chat in Telegram. But it's why the experience holds up:

  • Simple questions come back fast, because they're not queueing behind a frontier model.
  • Hard questions still get premium-model quality.
  • Your monthly allowance stretches much further than it would on a single-model service.

Routing is one of those features that's invisible when it works — the only place you'd notice it is the price tag.

Want to feel it rather than read about it? Set up your assistant in ten minutes and send it both a dumb question and a hard one.