Not every task needs the same model. Ultron routes each request to a model based on what the task requires, so complex reasoning gets the stronger model and simple lookups stay fast.

Routing logic

| Task type | Model | Rationale |
| --- | --- | --- |
| Research, competitive analysis | Kimi K2 | Deep reasoning, multi-source synthesis |
| Cold email, content creation | Kimi K2 | Voice matching, quality gate compliance |
| Objection handling, strategy | Kimi K2 | Nuanced reasoning, context awareness |
| Lead scoring | Fast model | Pattern matching, high throughput |
| Quick lookups, summaries | Fast model | Speed matters, complexity doesn’t |
| Routing decisions | Fast model | Lightweight classification |
| Memory compression | Fast model | Summarization at scale |
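
The mapping above can be read as a simple lookup from task type to model. The following is a minimal sketch, not Ultron's actual implementation: the task-type keys, the model identifier strings, and the fallback to the fast model for unknown task types are all assumptions made for illustration.

```python
# Hypothetical identifiers: the documentation names "Kimi K2" and a
# "fast model" but does not give the strings used internally.
KIMI_K2 = "kimi-k2"
FAST = "fast"

# Task-type keys are illustrative labels for the categories listed above.
MODEL_BY_TASK = {
    "research": KIMI_K2,
    "competitive_analysis": KIMI_K2,
    "cold_email": KIMI_K2,
    "content_creation": KIMI_K2,
    "objection_handling": KIMI_K2,
    "strategy": KIMI_K2,
    "lead_scoring": FAST,
    "quick_lookup": FAST,
    "summary": FAST,
    "routing_decision": FAST,
    "memory_compression": FAST,
}


def model_for_task(task_type: str) -> str:
    """Return the model for a task type.

    Falling back to the fast model for unknown types is an assumption,
    not documented behavior.
    """
    return MODEL_BY_TASK.get(task_type, FAST)
```

For example, `model_for_task("research")` returns the Kimi K2 identifier, while `model_for_task("lead_scoring")` returns the fast model.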

How routing is decided

The routing decision is made before the API call, based on four factors:
  1. Message complexity: length, number of entities, and implied reasoning depth
  2. Task type: skill runs always use the model specified in their SKILL.md config
  3. User plan: the free plan routes more aggressively to the fast model
  4. Token estimate: very short tasks default to the fast model even if they look complex
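
To show how these four factors could combine, here is a rough sketch of a routing function. Everything specific in it is an assumption: the `Request` fields, the order in which the checks run, the four-characters-per-token estimate, the cue words, and the thresholds are hypothetical and are not taken from Ultron's code.

```python
from dataclasses import dataclass
from typing import Optional

KIMI_K2 = "kimi-k2"  # hypothetical identifier
FAST = "fast"        # hypothetical identifier


@dataclass
class Request:
    message: str
    plan: str = "paid"                 # "free" or "paid" (assumed values)
    skill_model: Optional[str] = None  # model pinned by a skill's SKILL.md, if any


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly four characters per token.
    return max(1, len(text) // 4)


def complexity_score(text: str) -> int:
    # Combines length, a capitalized-word proxy for entities, and
    # reasoning cue words. The weights are illustrative only.
    words = text.split()
    entities = sum(1 for w in words[1:] if w[:1].isupper())
    cues = ("why", "compare", "strategy", "analyze")
    reasoning = sum(text.lower().count(c) for c in cues)
    return len(words) // 50 + entities + 2 * reasoning


def route(req: Request) -> str:
    # Task type: a skill run uses the model from its config.
    if req.skill_model is not None:
        return req.skill_model
    # Token estimate: very short tasks go to the fast model.
    if estimate_tokens(req.message) < 20:
        return FAST
    # User plan: the free plan needs a higher score to reach Kimi K2.
    threshold = 6 if req.plan == "free" else 3
    # Message complexity decides the rest.
    return KIMI_K2 if complexity_score(req.message) >= threshold else FAST
```

In this sketch a short question such as "What time is it?" goes to the fast model regardless of plan, and the same mid-complexity message can land on Kimi K2 for a paid user but on the fast model for a free user. Checking the skill config first is one plausible reading of "always"; the real precedence between the factors is not stated in the source.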

Credit impact

Kimi K2 costs more per token than the fast model. The routing system conserves your credits by:
  • Sending memory compression, lead scoring, and routing decisions to the fast model
  • Reserving Kimi K2 for tasks where output quality directly affects business outcomes
  • Giving you visibility into model usage per session in Settings → Usage
Each skill’s SKILL.md file specifies which model it uses. Research and sales skills use Kimi K2. Ops skills like morning briefing and pipeline review use the fast model.