Create Completion

POST /api/v1/completions
The core endpoint of the Theo API. Sends a prompt through the full orchestration pipeline and returns the complete response.
For real-time token delivery, set stream: true or see Streaming Completions.

Authentication

Requires a Bearer token. See Authentication.
Authorization: Bearer theo_sk_...

Request Body

prompt
string
required
The prompt text. Must be a non-empty string.
mode
string
default:"auto"
Execution mode. When set to auto, Theo classifies the prompt and selects the optimal engine automatically. Available modes:
  • auto — Classify prompt and route to best engine (default)
  • fast — Low-latency responses for simple queries
  • think — Deep reasoning for complex analysis
  • code — Code generation (Theo Code engine, extended output budget)
  • image — Image generation (Theo Create)
  • video — Video generation (async)
  • research — Deep web research with citations (async)
  • roast — Humorous, irreverent tone
  • genui — Generate interactive UI components (OpenUI Lang)
stream
boolean
default:"false"
Enable SSE streaming. When true, returns a text/event-stream response instead of JSON. See Streaming.
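When stream is true, the response arrives as server-sent events, which frame each message as a "data: <payload>" line followed by a blank line. The sketch below parses that standard framing locally; the exact shape of each event payload, and the "[DONE]" sentinel, are assumptions for illustration — see the Streaming docs for the real event schema.

```python
import json

def iter_sse_data(lines):
    """Yield the decoded payload of each 'data:' line in an SSE stream."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":  # end-of-stream sentinel (assumed)
                return
            yield json.loads(payload)

# Example with a synthetic stream; a real client would read lines
# from the text/event-stream HTTP response instead.
sample = [
    'data: {"content": "Hello"}',
    "",
    'data: {"content": " world"}',
    "",
    "data: [DONE]",
]
chunks = [evt["content"] for evt in iter_sse_data(sample)]
print("".join(chunks))  # Hello world
```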
conversation_id
string
Continue an existing conversation. Pass the conversation ID to maintain multi-turn context.
skills
string[]
Skill slugs to activate for this request. These are merged with the user’s installed skills. Each slug activates a skill’s prompt extension, tools, and model preferences for this completion. You can find slugs in the dashboard (copy icon on each skill card), via GET /api/v1/skills, or in the E.V.I. Canvas Input node. See Activating Skills via API for the full guide.
tools
object[]
Inline tool definitions the model can call during the agent loop.
persona
string | object
default:"theo"
Override Theo’s personality for this request.
  • "theo" — Default Theo persona
  • "none" — No persona (raw model output)
  • { "system_prompt": "You are..." } — Custom system prompt
temperature
number
Sampling temperature (0–2). Higher values produce more creative output.
max_iterations
integer
default:"8"
Maximum agent loop iterations (1–20). Each iteration is a think → act → observe cycle.
model_overrides
object
Override the engine used for specific modes. Keys are mode names (e.g., "code", "think"), values are Theo engine IDs (e.g., "theo-1-reason", "theo-1-flash"). See List Models for valid engine IDs.
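As a sketch, a payload pinning specific modes to engines might be assembled like this. The engine IDs are the examples the description above mentions; check List Models for the values valid on your account.

```python
import json

payload = {
    "prompt": "Refactor this function for readability",
    "mode": "auto",
    "model_overrides": {
        # keys are mode names, values are Theo engine IDs
        "code": "theo-1-flash",
        "think": "theo-1-reason",
    },
}

# The serialized body is what you would pass to curl's --data flag.
body = json.dumps(payload)
print(json.loads(body)["model_overrides"]["think"])  # theo-1-reason
```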
format
string
default:"theo"
Response format. "theo" for the default format, "openai" for OpenAI-compatible format.
metadata
object
Arbitrary key-value metadata attached to the completion. Returned in the response and logged in the audit trail.
component_library
string
Component library identifier for GenUI mode. Used by E.V.I. callers for custom UI rendering.

Request Examples

curl -X POST https://hitheo.ai/api/v1/completions \
  -H "Authorization: Bearer $THEO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain microservices architecture",
    "mode": "auto",
    "temperature": 0.7
  }'

With Skills and Tools

curl -X POST https://hitheo.ai/api/v1/completions \
  -H "Authorization: Bearer $THEO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Check current inventory levels for SKU-1234",
    "mode": "auto",
    "skills": ["inventory-check"],
    "tools": [
      {
        "name": "check_stock",
        "description": "Look up current stock levels by SKU",
        "input_schema": {
          "type": "object",
          "properties": {
            "sku": { "type": "string" },
            "warehouse": { "type": "string" }
          },
          "required": ["sku"]
        }
      }
    ],
    "persona": { "system_prompt": "You are Atlas, an operations assistant." },
    "max_iterations": 5
  }'

Response

id
string
Unique completion ID (prefixed cmpl_).
object
string
Always "completion".
created
string
ISO 8601 timestamp.
content
string
The generated text content.
mode
string
The mode you requested (e.g., "auto").
resolved_mode
string
The mode Theo actually used after intent classification (e.g., "fast", "think", "code").
model
object
The Theo engine that handled the request.
tools_used
object[]
Tools called during the agent loop.
artifacts
object[]
Generated files (images, code, documents) produced during the completion.
follow_ups
object[]
Suggested next prompts.
usage
object
Token counts and cost.
metadata
object | null
The metadata you passed in the request, echoed back.

Example Response

{
  "id": "cmpl_abc123",
  "object": "completion",
  "created": "2026-04-10T12:00:00Z",
  "content": "Microservices architecture is a design pattern where an application is composed of small, independent services...",
  "mode": "auto",
  "resolved_mode": "fast",
  "model": {
    "id": "theo-1-flash",
    "label": "Theo Flash",
    "engine": "theo-core"
  },
  "tools_used": [],
  "artifacts": [],
  "follow_ups": [
    { "label": "Compare with monoliths", "prompt": "Compare microservices vs monolithic architecture" },
    { "label": "Service mesh", "prompt": "Explain service mesh in microservices" }
  ],
  "usage": {
    "cost_cents": 0.02,
    "prompt_tokens": 12,
    "completion_tokens": 156,
    "total_tokens": 168
  },
  "metadata": null
}
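A minimal sketch of reading the fields most clients need from that response. The dict mirrors the example above (in practice it would come from decoding the HTTP response body); note that usage.total_tokens is the sum of the prompt and completion counts (12 + 156 = 168).

```python
# Mirrors the example response above; trimmed to the fields used below.
response = {
    "id": "cmpl_abc123",
    "resolved_mode": "fast",
    "content": "Microservices architecture is a design pattern...",
    "follow_ups": [
        {"label": "Compare with monoliths",
         "prompt": "Compare microservices vs monolithic architecture"},
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 156, "total_tokens": 168},
}

text = response["content"]
usage = response["usage"]
# total_tokens is the sum of prompt and completion tokens
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
# Follow-up prompts can be fed straight back into the endpoint as new requests
next_prompts = [f["prompt"] for f in response["follow_ups"]]
print(next_prompts[0])
```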

OpenAI-Compatible Format

Pass format: "openai" to receive responses in OpenAI’s chat.completions format. This allows drop-in replacement in existing OpenAI-based applications.
curl -X POST https://hitheo.ai/api/v1/completions \
  -H "Authorization: Bearer $THEO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello",
    "format": "openai"
  }'
The response follows the OpenAI chat.completion schema with choices, usage, and model fields.
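In OpenAI's chat.completion schema the generated text sits at choices[0].message.content. The sketch below shows that access pattern against an illustrative response; the surrounding field values are assumptions, not a captured Theo response.

```python
# Illustrative OpenAI-style response body (shape per OpenAI's
# chat.completion schema; values here are made up for the example).
openai_style = {
    "object": "chat.completion",
    "model": "theo-1-flash",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 1, "completion_tokens": 7, "total_tokens": 8},
}

# Existing OpenAI-based code reads the text like this:
text = openai_style["choices"][0]["message"]["content"]
print(text)  # Hello! How can I help?
```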

Semantic Caching

Non-conversation completions (no conversation_id) are automatically cached. Identical requests return cached results instantly at zero cost. See Semantic Caching. Cached responses include "_cached": true in the response body.
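A tiny helper for detecting cache hits. The docs state that cached responses include "_cached": true; that the key is simply absent on uncached responses is an assumption here, hence the defaulting .get.

```python
def is_cache_hit(response: dict) -> bool:
    # Cached responses carry "_cached": true; we assume the key is
    # absent (rather than false) on uncached responses.
    return bool(response.get("_cached", False))

print(is_cache_hit({"id": "cmpl_1", "_cached": True}))  # True
print(is_cache_hit({"id": "cmpl_2"}))                   # False
```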

Errors

Status  Code                  Description
400     validation_error      Invalid request body (missing prompt, invalid mode, etc.)
401     invalid_api_key       Missing or invalid API key
402     insufficient_credits  Account has insufficient balance
404     not_found             Conversation ID not found
429     rate_limit_exceeded   Too many requests; check the Retry-After header
500     server_error          Internal server error
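Since 429 responses advertise a Retry-After header, a client can back off for exactly that long before retrying. This is a sketch of one such policy (our own helper, not mandated by the API): `send` is any callable returning an object with .status_code and .headers, such as a wrapped HTTP call.

```python
import time

def post_with_retry(send, max_attempts=3):
    """Call `send`, retrying on 429 after the advertised Retry-After delay."""
    for attempt in range(max_attempts):
        resp = send()
        if resp.status_code != 429:
            return resp
        if attempt < max_attempts - 1:
            # Default to 1s if the header is missing (our choice)
            time.sleep(float(resp.headers.get("Retry-After", 1)))
    return resp  # still rate limited after max_attempts

# Demo with a fake transport: first call is rate limited, second succeeds.
class FakeResponse:
    def __init__(self, status_code, headers=None):
        self.status_code = status_code
        self.headers = headers or {}

calls = [FakeResponse(429, {"Retry-After": "0"}), FakeResponse(200)]
resp = post_with_retry(lambda: calls.pop(0))
print(resp.status_code)  # 200
```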