Use the Right Mode
| Mode | Cost | Use When |
|---|---|---|
fast | Lowest | Simple Q&A, classification, short responses |
auto | Variable | Let Theo decide — usually picks the cheapest adequate model |
think | Higher | Complex reasoning, multi-step analysis |
code | Higher | Code generation, architecture design |
Tips
- Set
mode: "fast"for simple tasks — routes to Theo Flash (cheapest) - Use semantic caching — identical requests are free after the first call
- Monitor usage —
GET /api/v1/usageshows cost breakdown by mode and model - Set daily caps — configure spending limits in the billing dashboard
- Use per-request skills instead of installing everything — fewer skills = less prompt tokens
- Lower
max_iterations— if you know the task needs only 1-2 tool calls, setmax_iterations: 3
