Set stream: true on any completion request to receive events as they happen instead of waiting for the full response.
SSE is the same protocol OpenAI and Anthropic use — any HTTP client that can parse text/event-stream works. The SDK wraps it into an async iterable.
Why streaming?
- Time-to-first-token under 300ms on fast modes: users see output immediately instead of staring at a spinner.
- Agent-loop transparency: tool and artifact events fire the moment Theo uses a tool or produces a file, so you can render a live “Theo is browsing…” or “Generated image” card.
- Graceful cancellation: close the HTTP connection (or call stream.cancel() in the SDK) to stop billing for tokens you don’t want.
Event flow
A normal turn emits events in this order:
1. thinking: a heartbeat so proxies flush the response.
2. meta: resolved mode, branded model, routing telemetry, conversation_id, request_id.
3. skills (optional): which skills fired for this turn.
4. genui_meta (optional): only when resolved_mode === "genui".
5. tool / artifact: as they happen (may interleave).
6. token: one per text chunk.
7. done: full content, follow-ups, usage, conversation_id, request_id.
If anything fails, the server emits an error event whose payload matches the REST error envelope ({ error: { message, type, code, request_id } }) and closes the stream. No special handling is required: reuse your HTTP error code path.
See Streaming Completions (API reference) for the per-event JSON schemas, wire-format rules, and a mid-stream 429 example.
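The event order above can be folded into a single piece of UI state. The following is a minimal sketch, not the SDK's own types: only the event names and the error envelope come from this page, while the payload field names (text, name, content) and the TurnState shape are assumptions made for illustration. Check the API reference for the real schemas.

```typescript
// Hypothetical payload shapes: only the event names and the error envelope
// come from the text above; every other field name here is an assumption.
type StreamEvent =
  | { type: "thinking" | "skills" | "genui_meta" }
  | { type: "meta"; conversation_id: string | null; request_id: string }
  | { type: "tool" | "artifact"; name: string }
  | { type: "token"; text: string }
  | { type: "done"; content: string; conversation_id: string | null; request_id: string }
  | { type: "error"; error: { message: string; type: string; code: string; request_id: string } };

interface TurnState {
  text: string;                // text accumulated from token events
  activity: string[];          // tool/artifact events in arrival order
  requestId: string | null;    // taken from meta (or from the error envelope)
  finished: boolean;           // true after done or error
  errorMessage: string | null; // set only when an error event arrives
}

const emptyTurn: TurnState = {
  text: "",
  activity: [],
  requestId: null,
  finished: false,
  errorMessage: null,
};

// Pure reducer: returns the next state for one incoming event.
function applyEvent(state: TurnState, event: StreamEvent): TurnState {
  switch (event.type) {
    case "meta":
      return { ...state, requestId: event.request_id };
    case "tool":
    case "artifact":
      return { ...state, activity: [...state.activity, `${event.type}:${event.name}`] };
    case "token":
      return { ...state, text: state.text + event.text };
    case "done":
      // done carries the full content, so it replaces the accumulated chunks.
      return { ...state, text: event.content, finished: true };
    case "error":
      // Same envelope as a REST error, so the same error path can handle it.
      return {
        ...state,
        requestId: event.error.request_id,
        errorMessage: event.error.message,
        finished: true,
      };
    default:
      // thinking, skills, genui_meta: nothing to accumulate in this sketch.
      return state;
  }
}
```

Keeping the reducer pure means the same function can drive a UI store or a test that replays a recorded event sequence.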
SDK Streaming
The SDK returns a TheoStream. It is async-iterable, so for await works out of the box. After the stream completes, it exposes conversationId, usage, model, and requestId as properties.
Cancelling mid-generation
TheoStream.cancel() aborts the underlying HTTP connection, so the server stops generating and billing stops. The iterator ends after the next event boundary.
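As a rough illustration of the cancellation pattern, the helper below reads events until a caller-supplied condition is met and then calls cancel(). It is written against a structural interface (anything async-iterable with a cancel() method) rather than the real TheoStream type, and fakeStream is a hypothetical stand-in used only to make the sketch runnable.

```typescript
// Structural stand-in for the SDK stream: async-iterable plus cancel().
interface CancellableStream<T> extends AsyncIterable<T> {
  cancel(): void;
}

// Collects events until shouldStop returns true, then cancels the stream.
async function readUntil<T>(
  stream: CancellableStream<T>,
  shouldStop: (event: T) => boolean,
): Promise<T[]> {
  const seen: T[] = [];
  for await (const event of stream) {
    seen.push(event);
    if (shouldStop(event)) {
      stream.cancel(); // ask the stream to abort the underlying connection
      break;           // stop consuming; anything still in flight is ignored
    }
  }
  return seen;
}

// Hypothetical in-memory stream for illustration; not the SDK implementation.
function fakeStream(chunks: string[]) {
  let cancelled = false;
  return {
    wasCancelled: (): boolean => cancelled,
    cancel(): void {
      cancelled = true;
    },
    async *[Symbol.asyncIterator](): AsyncGenerator<string> {
      for (const chunk of chunks) {
        if (cancelled) return;
        yield chunk;
      }
    },
  };
}
```

A typical stop condition would be a user pressing a "Stop" button or a character budget being exceeded.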
Raw SSE (curl / fetch)
Use the canonical www host to avoid an apex-to-www redirect that some HTTP clients handle by stripping the Authorization header (see 401 Troubleshooting).
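Without the SDK, a client has to split the text/event-stream body into events itself. The sketch below assumes standard SSE framing (an event: field naming the event, data: lines carrying the payload, a blank line ending each frame); the exact wire format is defined in the API reference. The endpoint URL is left as a parameter, and the Bearer scheme in the Authorization header is an assumption, not something this page specifies.

```typescript
interface SseEvent {
  event: string; // value of the "event:" field, "message" when absent
  data: string;  // "data:" lines joined with "\n"
}

// Incremental parser: feed it decoded text chunks in arrival order.
// Handles LF and CRLF line endings (bare CR is not handled in this sketch).
class SseParser {
  private buffer = "";

  // Returns every event completed by this chunk; partial frames stay buffered.
  push(chunk: string): SseEvent[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let boundary: number;
    while ((boundary = this.buffer.indexOf("\n\n")) !== -1) {
      const frame = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      let event = "message";
      const data: string[] = [];
      for (const line of frame.split("\n")) {
        if (line === "" || line.startsWith(":")) continue; // comment or keep-alive
        const colon = line.indexOf(":");
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? "" : line.slice(colon + 1);
        if (value.startsWith(" ")) value = value.slice(1);
        if (field === "event") event = value;
        else if (field === "data") data.push(value);
      }
      if (data.length > 0) events.push({ event, data: data.join("\n") });
    }
    return events;
  }
}

// Streams events from a completion endpoint. `url` should point at the
// canonical www host; the path and request body fields are up to the caller.
async function* streamEvents(
  url: string,
  apiKey: string,
  body: object,
): AsyncGenerator<SseEvent> {
  const response = await fetch(url, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      "Content-Type": "application/json",
      Accept: "text/event-stream",
    },
    body: JSON.stringify({ ...body, stream: true }),
    redirect: "error", // fail loudly instead of following a redirect that may drop the header
  });
  if (!response.ok || response.body === null) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  const parser = new SseParser();
  const decoder = new TextDecoder();
  const reader = response.body.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    yield* parser.push(decoder.decode(value, { stream: true }));
  }
}
```

Setting redirect: "error" is a design choice for this sketch: if the apex host is used by mistake, the request fails immediately rather than arriving at the www host without credentials.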
Multi-turn conversations
The conversation_id appears on both the meta and done events. When you pass one in the request, it is echoed back unchanged; when you don’t, it is null (the server doesn’t create a persistent conversation for API-key callers unless you explicitly call POST /api/v1/conversations).
The SDK captures the ID automatically — read stream.conversationId after the stream completes.
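For callers building request bodies by hand, the echo behaviour suggests a simple rule: forward the ID when the previous turn returned one, and omit it when it was null. A minimal sketch of that rule follows; withConversation is a hypothetical helper, and the request field name conversation_id is assumed to match the name used on the events.

```typescript
// Adds conversation_id to a request body only when a previous turn returned one.
// The body itself is treated as opaque; this helper never inspects its fields.
function withConversation<T extends object>(body: T, conversationId: string | null) {
  return conversationId === null
    ? { ...body }
    : { ...body, conversation_id: conversationId };
}
```

The ID passed in would come from the previous turn's meta or done event, or from stream.conversationId when using the SDK.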
E.V.I. Streaming
E.V.I. instances (theo.evi(...)) support streaming with the same TheoStream surface.
