Completions & Streaming - Theo API Docs

`theo.complete(request)`

Sends a prompt through the full orchestration pipeline and returns the complete response.

const res = await theo.complete({
  prompt: "Explain microservices architecture",
  mode: "auto",           // or "fast", "think", "code", "research", etc.
  skills: ["deep-research"],
  conversation_id: "conv_abc123",  // optional — continue a conversation
  persona: { system_prompt: "You are a senior architect..." },
  tools: [/* inline tool definitions */],
  temperature: 0.7,
  max_iterations: 8,
  metadata: { source: "my-app" },
});

console.log(res.content);                  // Generated text
console.log(res.model.label);              // "Theo Reason"
console.log(res.resolved_mode);            // "think"
console.log(res.tools_used);               // [{ name, status, description? }]
console.log(res.usage.cost_cents);         // 0.15
console.log(res.usage.prompt_tokens);      // 42
console.log(res.usage.completion_tokens);  // 128
console.log(res.usage.total_tokens);       // 170
console.log(res.usage.cached);             // true when served from semantic cache
console.log(res.conversation_id);          // server-side conversation id or null
console.log(res.request_id);               // mirror of the X-Request-Id header

CompletionRequest

Field	Type	Required	Default	Description
`prompt`	`string`	✅	—	The prompt text
`mode`	`ChatMode`	—	`"auto"`	Execution mode
`conversation_id`	`string \| null`	—	—	Continue a persisted conversation (null accepted)
`conversation`	`InlineConversation`	—	—	Inline history envelope for stateless callers
`skills`	`string[]`	—	—	Skill slugs to activate
`tools`	`ToolDef[]`	—	—	Inline tool definitions
`persona`	`PersonaInput`	—	`"theo"`	Persona override
`personality_config`	`PersonalityConfigInput`	—	—	Trait intensities + uncensored mode override
`response_style`	`ResponseStyleInput`	—	—	Format / preciseness / intent override
`theo_branding`	`boolean`	—	—	`false` strips Theo personality/branding from this response
`brand_soul`	`boolean`	—	—	`false` disables the API key’s Brand Soul for this request
`temperature`	`number`	—	—	Sampling temperature
`max_iterations`	`number`	—	`8`	Max agent loop iterations
`stream`	`boolean`	—	`false`	Enable SSE streaming
`model_overrides`	`Record<string, string>`	—	—	Override model per mode
`metadata`	`Record<string, unknown>`	—	—	Custom metadata
`attachments`	`CompletionAttachment[]`	—	—	Image attachments for vision analysis
`image_model`	`TheoImageEngine`	—	—	Pin a Theo image sub-engine when mode resolves to `image`
`image_quality`	`TheoImageQuality`	—	—	`"draft" \| "standard" \| "hd"`
`stealth_model`	`string`	—	—	Pin a stealth model (stealth modes only)
`stealth_aspect`	`TheoStealthAspect`	—	—	Stealth image aspect ratio
`stealth_duration`	`TheoStealthDuration`	—	—	Stealth video duration

`null` is accepted in place of any optional field and is treated as if the field were absent (matches OpenAI / Anthropic / Stripe behavior).
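
The null-handling rule can also be mirrored on the client, which is handy when a request is built from nullable form state. The following is a minimal sketch under assumptions: `stripNulls` is a hypothetical helper, not part of the Theo SDK, and it only looks at top-level fields.

```typescript
// Hypothetical helper, not part of the Theo SDK: drops top-level fields whose
// value is null or undefined, mirroring the documented rule that null in an
// optional field is treated as the field being absent.
type NonNullFields<T> = { [K in keyof T]?: Exclude<T[K], null | undefined> };

function stripNulls<T extends Record<string, unknown>>(input: T): NonNullFields<T> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value !== null && value !== undefined) {
      out[key] = value;
    }
  }
  return out as NonNullFields<T>;
}

// Form state where unset options are represented as null.
const draft = {
  prompt: "Explain microservices architecture",
  mode: null,
  temperature: 0.7,
  conversation_id: null,
};

// Only `prompt` and `temperature` survive.
const normalized = stripNulls(draft);
console.log(normalized);
```

Since the server is documented to accept `null` directly, a helper like this is optional. It mainly keeps logged or cached request payloads tidy.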

CompletionResponse.usage

interface CompletionUsage {
  cost_cents: number;           // markup-inclusive
  prompt_tokens: number;        // always 0 for non-text modes
  completion_tokens: number;    // always 0 for non-text modes
  total_tokens: number;
  cached?: boolean;             // true when served from the semantic cache
}

For `image`, `video`, `tts`, and `stt` modes, token counts are always 0. Use `cost_cents` as the sole usage metric in those modes.
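
As an illustration of treating `cost_cents` as the primary metric, here is a small sketch of a log formatter. `summarizeUsage` is a hypothetical helper, and the interface is a local copy of the shape above. Using `total_tokens > 0` to detect a text mode is an assumption, and the image cost is an invented illustrative value.

```typescript
// Local copy of the CompletionUsage shape documented above.
interface CompletionUsage {
  cost_cents: number;
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  cached?: boolean;
}

// Hypothetical helper: builds a one-line usage summary. Token counts are
// included only when non-zero, because non-text modes always report 0.
function summarizeUsage(usage: CompletionUsage): string {
  const parts = [`cost_cents=${usage.cost_cents.toFixed(2)}`];
  if (usage.total_tokens > 0) {
    parts.push(`tokens=${usage.prompt_tokens}+${usage.completion_tokens}`);
  }
  if (usage.cached) {
    parts.push("cached");
  }
  return parts.join(" ");
}

// Values taken from the complete() example above.
const textUsage: CompletionUsage = {
  cost_cents: 0.15,
  prompt_tokens: 42,
  completion_tokens: 128,
  total_tokens: 170,
};

// Illustrative values for an image request (the cost is made up).
const imageUsage: CompletionUsage = {
  cost_cents: 4,
  prompt_tokens: 0,
  completion_tokens: 0,
  total_tokens: 0,
};

console.log(summarizeUsage(textUsage));  // cost_cents=0.15 tokens=42+128
console.log(summarizeUsage(imageUsage)); // cost_cents=4.00
```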

`theo.stream(request)`

Returns a `TheoStream`: an async-iterable handle with a `cancel()` method, plus final-metadata properties that are populated as events arrive.

const stream = theo.stream({ prompt: "Explain DNS" });

// Hook up a "Stop generating" button
stopButton.onclick = () => stream.cancel();

for await (const event of stream) {
  switch (event.type) {
    case "meta":     console.log("Mode:", event.data.resolved_mode); break;
    case "token":    process.stdout.write(event.token); break;
    case "tool":     console.log("Tool:", event.data.name, event.data.status); break;
    case "artifact": renderArtifact(event.data); break;
    case "error":    console.error("Stream error:", event.data.error); break;
    case "done":     console.log("\nCost:", event.data.usage.cost_cents); break;
  }
}

// Populated from meta + done events:
console.log("conversation_id:", stream.conversationId);
console.log("usage:", stream.usage);
console.log("model:", stream.model);
console.log("request_id:", stream.requestId);
console.log("full content:", stream.content); // accumulated from token events
console.log("was cancelled?", stream.isCancelled);

TheoStream

Member	Type	Description
`[Symbol.asyncIterator]()`	`AsyncIterator<StreamEvent>`	Standard async iteration — works with `for await`
`cancel()`	`void`	Abort the underlying HTTP connection; stops generation + billing
`isCancelled`	`boolean`	`true` after `cancel()` has been called
`requestId`	`string \| null`	Server request id (from `done` event or `X-Request-Id` header)
`model`	`{ id, label, engine } \| null`	Populated from the `meta` event
`resolvedMode`	`ChatMode \| null`	Mode after intent classification (from `meta`)
`conversationId`	`string \| null`	Populated from `meta` / `done` (null for stateless callers)
`usage`	`CompletionUsage \| null`	Populated from the `done` event
`content`	`string`	Accumulated text content from `token` events
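
To make the handle semantics in this table concrete, the sketch below imitates them with an in-memory mock. `MockStream` and `MockEvent` are hypothetical stand-ins, not the SDK's implementation. The mock replays a fixed event list instead of reading from the network, and it models only `content`, `usage`, `isCancelled`, and `cancel()`.

```typescript
// Simplified, hypothetical event type covering two of the documented events.
type MockEvent =
  | { type: "token"; token: string }
  | { type: "done"; data: { usage: { cost_cents: number } } };

// Hypothetical stand-in for TheoStream; the real handle wraps an HTTP stream.
class MockStream implements AsyncIterable<MockEvent> {
  content = "";
  isCancelled = false;
  usage: { cost_cents: number } | null = null;
  private readonly events: MockEvent[];

  constructor(events: MockEvent[]) {
    this.events = events;
  }

  cancel(): void {
    this.isCancelled = true;
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<MockEvent> {
    for (const event of this.events) {
      if (this.isCancelled) return; // stop yielding once cancelled
      if (event.type === "token") this.content += event.token;
      if (event.type === "done") this.usage = event.data.usage;
      yield event;
    }
  }
}

// Consumes the stream and cancels after the second token.
async function demo(): Promise<MockStream> {
  const stream = new MockStream([
    { type: "token", token: "DNS " },
    { type: "token", token: "maps " },
    { type: "token", token: "names" },
    { type: "done", data: { usage: { cost_cents: 0.15 } } },
  ]);
  let seen = 0;
  for await (const event of stream) {
    if (event.type === "token" && ++seen === 2) stream.cancel();
  }
  return stream;
}

demo().then((s) => console.log(s.content, s.isCancelled, s.usage));
```

In this mock, cancelling before the `done` event leaves `usage` as `null`, which is why the table types it as nullable. Whether the real SDK behaves the same way after `cancel()` is not stated here, so check for `null` before reading it.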

StreamEvent (discriminated union)

`switch (event.type)` narrows `event.data` automatically.

`event.type`	`event.data` shape
`meta`	`StreamMetaData` — id, mode, resolved_mode, model, tools, artifacts, brand?, routing?, conversation_id, request_id
`token`	`{ token: string }` — also exposed as `event.token` for convenience
`tool`	`{ name, status, description? }`
`artifact`	`Record<string, unknown>` — shape varies by artifact type (image / video / document / code)
`skills`	`{ active: Array<{ id, slug, name, intensity? }> }`
`genui_meta`	`{ library, tools[] }` (GenUI mode only)
`done`	`StreamDoneData` — id, content, follow_ups, structured_output?, skills_active?, routing?, usage, conversation_id, request_id
`error`	`{ error: { message, type, code, request_id } }` — matches the REST error envelope

See Streaming Completions (API reference) for the full wire format and a mid-stream 429 example.
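
The narrowing behavior described for `StreamEvent` can be sketched with a trimmed-down union. The types below are hypothetical approximations built from the table, not the SDK's exported definitions. Only four event types are modelled, field types such as `code: string` are assumptions, and the sample values are invented.

```typescript
// Hypothetical subset of the StreamEvent union; shapes follow the table above.
type StreamEvent =
  | { type: "token"; token: string; data: { token: string } }
  | { type: "tool"; data: { name: string; status: string; description?: string } }
  | { type: "done"; data: { content: string; usage: { cost_cents: number } } }
  | {
      type: "error";
      data: { error: { message: string; type: string; code: string; request_id: string } };
    };

// Returns a one-line description of an event. Inside each case, TypeScript
// knows the exact shape of event.data. The `never` assignment stops compiling
// if a member is added to the union without a matching case.
function describe(event: StreamEvent): string {
  switch (event.type) {
    case "token":
      return `token:${event.data.token}`;
    case "tool":
      return `tool:${event.data.name}(${event.data.status})`;
    case "done":
      return `done:${event.data.usage.cost_cents}`;
    case "error":
      return `error:${event.data.error.code}`;
    default: {
      const unreachable: never = event;
      return unreachable;
    }
  }
}

// Illustrative events (values are made up for the example).
const samples: StreamEvent[] = [
  { type: "token", token: "Hi", data: { token: "Hi" } },
  { type: "tool", data: { name: "search", status: "running" } },
  { type: "done", data: { content: "Hi", usage: { cost_cents: 0.15 } } },
];

console.log(samples.map(describe));
```

The exhaustive `default` branch is a general TypeScript pattern rather than something the SDK requires. It is useful here because the real union has more members (`meta`, `artifact`, `skills`, `genui_meta`) than a typical handler covers.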
