Cost Optimization

Use the Right Mode

Mode	Cost	Use When
`fast`	Lowest	Simple Q&A, classification, short responses
`auto`	Variable	Let Theo decide — usually picks the cheapest adequate model
`think`	Higher	Complex reasoning, multi-step analysis
`code`	Higher	Code generation, architecture design
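
The table above can be turned into a small lookup on the client side. The following is a minimal sketch, not part of Theo's API: the `Task` categories and the `choose_mode` helper are hypothetical names introduced for illustration, and only the four mode strings come from the table.

```python
from enum import Enum


class Task(Enum):
    """Hypothetical task categories; not defined by Theo itself."""
    SIMPLE_QA = "simple_qa"
    CLASSIFICATION = "classification"
    REASONING = "reasoning"
    CODE = "code"
    UNKNOWN = "unknown"


# Mapping taken from the "Use When" column of the table.
MODE_FOR_TASK = {
    Task.SIMPLE_QA: "fast",
    Task.CLASSIFICATION: "fast",
    Task.REASONING: "think",
    Task.CODE: "code",
}


def choose_mode(task: Task) -> str:
    """Return the mode for a task, falling back to "auto" when unsure."""
    return MODE_FOR_TASK.get(task, "auto")
```

Falling back to `auto` for unrecognized tasks leaves the choice to Theo rather than defaulting to one of the higher-cost modes.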

Set mode: "fast" for simple tasks. This routes to Theo Flash, the cheapest model.
Use semantic caching. Identical requests are free after the first call.
Monitor usage. GET /api/v1/usage shows a cost breakdown by mode and model.
Set daily caps. Configure spending limits in the billing dashboard.
Use per-request skills instead of installing everything. Fewer skills means fewer prompt tokens.
Lower max_iterations. If you know the task needs only 1-2 tool calls, set max_iterations: 3.
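
Several of these tips can be applied together when building a request body. The sketch below is illustrative only: the `mode` and `max_iterations` fields are named in the text, while the `message` and `skills` field names, the `build_request` helper, and the "expected tool calls plus one" headroom rule are assumptions and may not match the actual request schema.

```python
import json
from typing import Optional, Sequence


def build_request(
    message: str,
    *,
    simple: bool = False,
    expected_tool_calls: Optional[int] = None,
    skills: Optional[Sequence[str]] = None,
) -> dict:
    """Build a cost-conscious request body (field names partly assumed)."""
    # Simple tasks go to "fast"; everything else is left to "auto".
    payload = {"message": message, "mode": "fast" if simple else "auto"}

    # Cap iterations with one spare step, e.g. 2 tool calls -> max_iterations 3.
    if expected_tool_calls is not None:
        payload["max_iterations"] = expected_tool_calls + 1

    # Attach only the skills this request needs ("skills" is an assumed field).
    if skills:
        payload["skills"] = list(skills)

    return payload


if __name__ == "__main__":
    body = build_request(
        "Classify this support ticket",
        simple=True,
        expected_tool_calls=2,
        skills=["ticket-triage"],  # hypothetical skill name
    )
    print(json.dumps(body, indent=2))
```

Fields that are not needed are omitted rather than sent with default values, so the service's own defaults still apply.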