The SDK applies different timeout strategies to one-shot (unary) calls and to streams, and it retries only when doing so is safe. This page is the single source of truth for that behavior.
Defaults
| Option | Default | Applies to |
|---|---|---|
| timeoutMs | 30_000 (30s) | Total duration of a unary request; also the connect/first-byte budget for a stream. |
| streamIdleTimeoutMs | 120_000 (120s) | Idle (between-chunks) timeout for a stream. Not a total cap. |
| maxRetries | 2 | Retry attempts for 429 / 5xx on idempotent calls. |

Configure any of them on the client:
import { Theo } from "@hitheo/sdk";

const theo = new Theo({
  apiKey: process.env.THEO_API_KEY!,
  timeoutMs: 60_000, // unary total-duration budget
  streamIdleTimeoutMs: 180_000, // allow longer gaps between stream chunks
  maxRetries: 2,
});
Unary requests
A unary call (complete, images, code, documents, job, list endpoints, …) is bounded by timeoutMs end-to-end. If it elapses, the SDK aborts the request and throws a TheoTimeoutError (error.kind === "timeout", error.timeoutMs carries the budget).
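Because the timeout surfaces through the kind discriminator, a caller can tell it apart from other failures without importing any error class. The following is a minimal sketch, assuming only the two fields named above (kind and timeoutMs); isTimeoutError and describeFailure are hypothetical helpers, not part of the SDK.

```typescript
// Shape taken from the text above: a timeout error exposes
// kind === "timeout" and timeoutMs (the budget that elapsed).
interface TimeoutLike {
  kind: "timeout";
  timeoutMs: number;
}

// Hypothetical helper (not part of the SDK): narrows an unknown caught value.
function isTimeoutError(err: unknown): err is TimeoutLike {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { kind?: unknown; timeoutMs?: unknown };
  return e.kind === "timeout" && typeof e.timeoutMs === "number";
}

// Hypothetical helper: turns a caught value into a log-friendly message.
function describeFailure(err: unknown): string {
  return isTimeoutError(err)
    ? `timed out after ${err.timeoutMs} ms`
    : "failed for a non-timeout reason";
}
```

In a catch block around a unary call, describeFailure(err) could feed a log line or a decision to raise timeoutMs for that call site.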
Streaming requests
stream() is not capped by a total-duration timeout — a long but healthy stream can run for minutes. Instead:
- timeoutMs guards the connection / first byte. If headers never arrive, you get a TheoTimeoutError (“Stream did not start…”).
- streamIdleTimeoutMs guards the gap between chunks. The timer resets on every chunk; if the stream goes silent for longer than the window, the SDK aborts it and the async iterator throws a TheoTimeoutError.
const stream = theo.stream({ prompt: "Write a long essay about TCP." });
try {
  for await (const event of stream) {
    if (event.type === "token") process.stdout.write(event.token);
  }
} catch (err) {
  // A TheoTimeoutError here means the stream either never started within
  // timeoutMs or stalled for longer than streamIdleTimeoutMs.
}
TheoStream.cancel() is different from a timeout. Cancelling ends the iterator cleanly (no throw); a timeout throws TheoTimeoutError.
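To make the two stream timers concrete, the sketch below replays a list of arrival times against both rules. It is a toy model under stated assumptions, not the SDK's implementation: checkStream and its parameters are hypothetical, and it assumes the idle timer starts counting from the first byte.

```typescript
type StreamOutcome =
  | { ok: true }
  | { ok: false; reason: "did-not-start" | "idle"; atMs: number };

// Toy model (hypothetical, not SDK code). All times are milliseconds since
// the request was sent. firstByteAtMs is when the stream started, chunkAtMs
// lists later chunk arrivals in ascending order, endAtMs is when it closed.
function checkStream(
  firstByteAtMs: number,
  chunkAtMs: number[],
  endAtMs: number,
  timeoutMs = 30_000,
  streamIdleTimeoutMs = 120_000,
): StreamOutcome {
  // timeoutMs only guards the start of the stream.
  if (firstByteAtMs > timeoutMs) {
    return { ok: false, reason: "did-not-start", atMs: timeoutMs };
  }
  // The idle timer restarts at every chunk, so only the gaps matter.
  // ASSUMPTION: the idle timer begins at the first byte.
  let last = firstByteAtMs;
  for (const t of [...chunkAtMs, endAtMs]) {
    if (t - last > streamIdleTimeoutMs) {
      return { ok: false, reason: "idle", atMs: last + streamIdleTimeoutMs };
    }
    last = t;
  }
  return { ok: true };
}
```

With the defaults, a stream whose first byte arrives at 0.5s and which then emits a chunk every 60s passes this check even after six minutes, while a single 128s gap between chunks fails it with reason "idle".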
Retry policy
The SDK retries automatically on:
- 429: rate limited (honors Retry-After up to 60s)
- 5xx: server errors (exponential backoff: 1s, 2s, 4s, capped at 8s)

It does not retry 4xx responses other than 429.
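The policy above can be restated as two small functions, which may help when mirroring it in your own wrapper or tests. This is a sketch, not the SDK's internals: isRetryableStatus and retryDelayMs are hypothetical names, and the behavior for a 429 that arrives without a Retry-After header is an assumption.

```typescript
// Hypothetical helpers that restate the documented policy; not SDK internals.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// retryIndex is 0 for the first retry, 1 for the second, and so on.
// retryAfterSeconds is the parsed Retry-After header, if the server sent one.
function retryDelayMs(
  status: number,
  retryIndex: number,
  retryAfterSeconds?: number,
): number {
  if (status === 429 && retryAfterSeconds !== undefined) {
    // Honor Retry-After, but never wait longer than 60s.
    return Math.min(Math.max(retryAfterSeconds, 0), 60) * 1000;
  }
  // Exponential backoff: 1s, 2s, 4s, capped at 8s.
  // ASSUMPTION: a 429 without Retry-After follows the same schedule.
  return Math.min(1000 * 2 ** retryIndex, 8000);
}
```

With the default maxRetries of 2, only the first two steps of the backoff schedule (1s, then 2s) would ever be used.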
Timeouts are not retried on writes
A timed-out POST (complete, research, video, …) is never retried. Re-issuing a non-idempotent generation could run — and bill — the same work twice. The timeout is terminal for those methods; only idempotent reads (GET/HEAD) retry on timeout.
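If you wrap SDK calls in your own retry loop, the same rule is easy to mirror. The helper below is a hypothetical sketch of that decision and is not exported by the SDK.

```typescript
// Hypothetical helper restating the rule above; not SDK code.
// Only idempotent reads are safe to re-issue after a timeout.
function retriesAfterTimeout(method: string): boolean {
  const m = method.toUpperCase();
  return m === "GET" || m === "HEAD";
}
```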
If you need at-most-once semantics across your own retries, send an Idempotency-Key header (supported platform-wide) so a replay reuses the original result instead of generating again.
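The replay behavior can be illustrated with a toy in-memory server. This is only a model of the idea, under the assumption that results are cached per key; ToyGenerationServer is hypothetical and says nothing about how the platform stores or expires keys.

```typescript
// Toy in-memory model of a server that honors Idempotency-Key.
// Everything here is hypothetical and only illustrates the replay behavior.
class ToyGenerationServer {
  private results = new Map<string, string>();
  generations = 0; // how many times real (billable) work was done

  handle(headers: Record<string, string>, prompt: string): string {
    const key = headers["Idempotency-Key"];
    if (key !== undefined) {
      const cached = this.results.get(key);
      if (cached !== undefined) return cached; // replay: reuse original result
    }
    this.generations += 1; // new work happens only on this path
    const result = `result #${this.generations} for "${prompt}"`;
    if (key !== undefined) this.results.set(key, result);
    return result;
  }
}
```

Sending the same key twice yields the same result and a single generation; sending no key twice generates twice. The practical consequence is that the key should be created once per logical request and reused across every retry of it.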
Per-mode guidance
- fast / auto: the 30s default is plenty.
- think / code: can run longer; raise timeoutMs (e.g. 60–120s) for big prompts.
- research / video: do not send these to complete() / stream(). They are asynchronous and run as jobs, so use theo.research() / theo.video() + theo.waitForJob() instead. The SDK throws a TheoUsageError immediately if you try. See Async Jobs.
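The guidance above can be captured in a small lookup if your application picks a mode at runtime. This is a hedged sketch: planCall and CallPlan are hypothetical, and 120_000 is just one point in the suggested 60–120s range, chosen for illustration.

```typescript
type Mode = "fast" | "auto" | "think" | "code" | "research" | "video";

type CallPlan =
  | { route: "unary-or-stream"; timeoutMs: number }
  | { route: "job" };

// Hypothetical helper (not part of the SDK) that maps a mode to a call plan.
function planCall(mode: Mode): CallPlan {
  switch (mode) {
    case "fast":
    case "auto":
      // The 30s default is enough.
      return { route: "unary-or-stream", timeoutMs: 30_000 };
    case "think":
    case "code":
      // ASSUMPTION: upper end of the suggested 60-120s range.
      return { route: "unary-or-stream", timeoutMs: 120_000 };
    case "research":
    case "video":
      // Asynchronous: run as a job and wait for it instead.
      return { route: "job" };
  }
}
```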
Errors
All of the above surface through the SDK error hierarchy. See Error Handling for TheoTimeoutError, TheoCancelledError, TheoUsageError, and the kind discriminator.