Integrations
This chapter covers the control socket (Unix domain socket) and web gateway (WebSocket) integration points. For the MCP server interface, see MCP Server. For the presence layer that mediates user interaction, see Presence Layer.
Control Socket
When --control-socket is enabled, a Unix domain socket is created at /tmp/intendant-<pid>.sock. This enables programmatic control of a running Intendant instance from external scripts and tools.
- Outbound event broadcast to all connected clients
- Inbound command handling for status, approval, denial, human input, autonomy change, quit, controller-restart workflow commands, and controller-loop intervention commands (in MCP mode)
- Socket server is opt-in via --control-socket
Inbound Commands (JSON-line)
{"action": "status"}
{"action": "approve", "id": 123}
{"action": "deny", "id": 123}
{"action": "input", "text": "answer to askHuman"}
{"action": "set_autonomy", "level": "high"}
{"action": "schedule_controller_restart", "controller_id":"codex", "north_star_goal":"audit and improve", "restart_after":"turn_end"}
{"action": "controller_turn_complete", "restart_id":"<id>", "turn_complete_token":"<token>", "status":"ok", "handoff_summary":"..."}
{"action": "get_restart_status"}
{"action": "cancel_controller_restart", "restart_id":"<id>"}
{"action": "request_controller_loop_halt", "persistent": true}
{"action": "clear_controller_loop_halt"}
{"action": "intervene_controller_loop", "mode":"stop"}
{"action": "get_controller_loop_status"}
{"action": "query_detail", "scope": "diff"}
{"action": "query_detail", "scope": "file", "target": "src/main.rs"}
{"action": "recall_memory", "keywords": ["auth", "login"], "channel": "project_state"}
{"action": "usage"}
{"action": "quit"}
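The commands above are single-line JSON objects. A minimal Python sketch of an encoder follows. The helper name encode_command is hypothetical, and treating one trailing newline as the frame delimiter is an assumption drawn from the "JSON-line" wording rather than a confirmed detail of the server.

```python
import json


def encode_command(action: str, **fields) -> bytes:
    """Serialize one control command as a single UTF-8 JSON line."""
    msg = {"action": action, **fields}
    # json.dumps escapes embedded newlines, so the only raw "\n" in the
    # output is the terminator appended here (assumed frame delimiter).
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")
```

For example, encode_command("set_autonomy", level="high") yields the bytes for the set_autonomy command shown above, followed by a newline.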
Outbound Events (streamed to connected clients)
{"event": "turn_started", "turn": 5, "budget_pct": 12.3}
{"event": "agent_output", "stdout": "...", "stderr": "..."}
{"event": "approval_required", "id": 123, "command": "rm -rf /tmp/test"}
{"event": "ask_human", "question": "Which database?"}
{"event": "task_complete", "reason": "done signal"}
{"event": "status", "turn": 3, "phase": "thinking", "autonomy": "medium", "session_id": "abc-123", "task": "fix tests"}
{"event": "usage", "main": {"provider": "openai", "model": "gpt-5", "tokens_used": 12000, "context_window": 128000, "usage_pct": 9.4}}
{"event": "usage_update", "main": {"provider": "openai", "model": "gpt-5", "tokens_used": 15000, "context_window": 128000, "usage_pct": 11.7}}
{"event": "command_result", "action": "get_restart_status", "ok": true, "message": "ok", "data": {...}}
- The status event now includes session_id and task fields.
- The usage event is a response to {"action": "usage"}, returning per-model token usage.
- The usage_update event is broadcast automatically after each agent turn, providing streaming token consumption updates. The presence field is included when the presence layer is active.
command_result.ok is false when a control action fails (for example, schedule_controller_restart with restart_after="now" and no executable restart action configured).
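A client reading the event stream typically needs to parse each line, notice failed control actions, and spot events that expect a reply. The sketch below shows one way to do that in Python. The names parse_event, needs_human, and CommandFailed are hypothetical, and the assumption that approval_required and ask_human are the only events expecting an approve, deny, or input command in reply is inferred from the command list, not confirmed.

```python
import json


class CommandFailed(Exception):
    """Raised when a command_result event carries ok == false."""


def parse_event(line: str) -> dict:
    """Parse one outbound JSON line and surface failed control actions."""
    event = json.loads(line)
    if "event" not in event:
        raise ValueError("missing 'event' field")
    if event["event"] == "command_result" and not event.get("ok", False):
        raise CommandFailed(f"{event.get('action')}: {event.get('message')}")
    return event


def needs_human(event: dict) -> bool:
    """True for events that expect an approve/deny or input command (assumed set)."""
    return event["event"] in ("approval_required", "ask_human")
```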
Example Usage
echo '{"action":"status"}' | socat - UNIX:/tmp/intendant-$(pgrep intendant).sock
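The same request can be made from Python without socat. This is a sketch under stated assumptions: request_status is a hypothetical helper, the socket path is passed in explicitly instead of being derived from pgrep, and the loop skips unrelated broadcast events on the assumption that they may arrive before the status reply.

```python
import json
import socket


def request_status(sock_path: str, timeout: float = 5.0) -> dict:
    """Send {"action":"status"} and return the first status event received."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        sock.sendall(b'{"action":"status"}\n')
        # makefile() gives line-oriented reads over the stream socket.
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                line = line.strip()
                if not line:
                    continue
                event = json.loads(line)
                # Other broadcast events may arrive first; skip them.
                if event.get("event") == "status":
                    return event
    raise ConnectionError("socket closed before a status event arrived")
```

A call such as request_status("/tmp/intendant-<pid>.sock"), with the real PID substituted, returns the parsed status event or raises if the socket closes or the timeout expires first.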
Web Gateway
The --web flag starts a web server that serves the app dashboard and bridges WebSocket connections to the EventBus. --web implies --mcp, so no initial task is required — the agent starts idle and accepts tasks dynamically.
See Web Dashboard for the full dashboard documentation and Presence Layer for details on the presence session protocol and mutual exclusion.
How It Works
Browser ──WebSocket──> Intendant web gateway (port 8765)
│ │
│ Terminal I/O (ANSI) │ Events (broadcast to all clients)
│ Key/resize input │ Tool responses (per-connection direct channel)
│ Tool requests │ State snapshot + log replay (on connect)
│ presence_connect/disconnect │ Presence welcome (on voice connect)
│ Voice logs/checkpoints │ Per-connection TUI frames
│ Audio for transcription │
v v
App dashboard (WASM) EventBus + AgentStateSnapshot
+ │
Optional: browser-side │ Dual outbound channels:
live model (Gemini/OpenAI) │ - broadcast::Receiver (events)
│ │ - mpsc::unbounded (direct responses)
│ (function calls → tool_request)
v
Intendant agent loop
The web gateway has three layers:
- App dashboard: The primary web interface at / with 4 tabs (Activity, Usage, Terminal, Displays). State management is handled by presence-web WASM. Events are broadcast, and late-connecting browsers get a full log replay.
- Per-connection TUI rendering: Each WebSocket connection gets its own WebTui instance with independent terminal dimensions. ANSI output is sent per-connection via the direct channel, not broadcast.
- Presence bridge (optional): When a browser connects a live model (Gemini Live / OpenAI Realtime), the model uses 9 presence tools that map to tool_request WebSocket messages. The gateway handles these server-side and returns tool_response messages on the per-connection direct channel.
WebSocket Protocol
Inbound Messages (browser → server)
| Message | Description |
|---|---|
| {"t":"key","key":"..."} | Keyboard input (routed to per-connection WebTui) |
| {"t":"resize","cols":N,"rows":N} | Terminal resize (per-connection) |
| {"t":"presence_connect",...} | Presence session protocol; replaces server-side presence |
| {"t":"presence_disconnect"} | Disconnect presence; resumes server-side presence |
| {"t":"make_active"} | Request active voice ownership (handover) |
| {"t":"voice_log","text":"...","seq":N} | Voice transcript from browser presence model |
| {"t":"presence_checkpoint","summary":"...","last_event_seq":N} | Context checkpoint |
| {"t":"voice_diagnostic","kind":"...","detail":"..."} | Browser voice diagnostics |
| {"t":"user_audio","data":"<base64>"} | PCM16 audio for server-side transcription |
| {"t":"tool_request","id":"...","tool":"...","args":{}} | Presence tool call |
| {"t":"async_query","id":"...","tool":"...","args":{}} | Async query (result as text, not tool response) |
| {"action":"..."} | ControlMsg (same as Unix control socket) |
| {"t":"live_connected"} / {"t":"live_disconnected"} | Legacy (still accepted) |
Outbound Messages (server → browser)
| Message | Description |
|---|---|
| {"t":"term","d":"<base64>"} | Per-connection TUI ANSI output |
| {"t":"state_snapshot","state":{...},"connection_id":"...","config":{...},"session_id":"..."} | Bootstrap on connect |
| {"t":"log_replay","entries":[...]} | Historical session events for late-connecting browsers |
| {"t":"presence_welcome","session_id":"...","state":{...},"events":[...],"is_active":bool,"conversation_context":"..."} | Presence session welcome |
| {"t":"active_granted","is_active":true,"handover_context":"...","conversation_context":"..."} | Active ownership granted |
| {"t":"force_disconnect_voice","reason":"handover"} | Sent to old active on handover |
| {"t":"presence_checkpoint_ack","seq":N} | Checkpoint acknowledgement |
| {"t":"tool_response","id":"...","result":"..."} | Response to a tool_request |
| {"t":"async_query_result","id":"...","tool":"...","result":"..."} | Response to async_query |
| {"event":"..."} | OutboundEvent broadcast (status, agent_output, approval_required, etc.) |
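Outbound traffic mixes two envelope styles: messages tagged with t and broadcast events tagged with event. The Python sketch below routes a raw message to a label and payload, and decodes the base64 body of term messages. The function name decode_outbound and the label scheme are hypothetical; only the field names come from the table above.

```python
import base64
import json


def decode_outbound(raw: str):
    """Return (label, payload) for one server-to-browser message."""
    msg = json.loads(raw)
    if "t" in msg:
        if msg["t"] == "term":
            # Terminal frames carry ANSI output as base64 in the "d" field.
            return ("term", base64.b64decode(msg["d"]))
        return (msg["t"], msg)
    if "event" in msg:
        # OutboundEvent broadcasts share the control socket's event format.
        return ("event:" + msg["event"], msg)
    raise ValueError("unrecognized outbound message")
```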
Tool Request/Response Protocol
The browser live model calls presence tools via tagged request/response messages:
// Browser sends:
{"t":"tool_request","id":"req-42","tool":"check_status","args":{}}
// Server responds (on direct channel):
{"t":"tool_response","id":"req-42","result":"Phase: Running agent (turn 5). Budget: 23% used."}
Action tools (submit_task, approve_action, deny_action, skip_action, respond_to_question, set_autonomy) are dispatched via the EventBus — the same path as TUI key presses and control socket commands.
Query tools (check_status, query_detail, recall_memory) are handled asynchronously server-side via presence::handle_tool_query(), which reads from the shared AgentStateSnapshot, project files, and knowledge store.
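Because responses arrive asynchronously on the direct channel, a client has to match each tool_response to its tool_request by id. A minimal Python sketch of that bookkeeping follows. The ToolCalls class and its req-N id scheme are hypothetical (the id format simply mirrors the req-42 example), and the sketch assumes each id is answered at most once.

```python
import itertools
import json


class ToolCalls:
    """Tracks in-flight tool requests so responses can be matched by id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # request id -> tool name

    def request(self, tool: str, args=None) -> str:
        """Build a tool_request message and remember its id."""
        req_id = f"req-{next(self._ids)}"
        self._pending[req_id] = tool
        return json.dumps(
            {"t": "tool_request", "id": req_id, "tool": tool, "args": args or {}}
        )

    def resolve(self, raw: str):
        """Match a tool_response to its request; returns (tool, result)."""
        msg = json.loads(raw)
        if msg.get("t") != "tool_response":
            raise ValueError("not a tool_response message")
        tool = self._pending.pop(msg["id"])  # KeyError if the id is unknown
        return tool, msg["result"]
```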
State Bootstrap
On WebSocket connect, the server sends multiple bootstrap messages:
- state_snapshot: Full AgentStateSnapshot with connection_id, config, and session_id
- Cached usage_update: Latest token usage data
- Cached status: Latest status (autonomy, session_id, task)
- Cached display_ready: Latest display info for VNC slots
- log_replay: Historical session events parsed from session.jsonl
This ensures late-connecting browsers see the complete state immediately.
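On the client side, the bootstrap messages can be folded into a single state object as they arrive. The Python sketch below is illustrative only: ClientState and its fields are hypothetical, and it assumes the cached usage_update, status, and display_ready messages use the event envelope, which the source confirms for status and usage_update but not for display_ready.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClientState:
    """Accumulates bootstrap and live messages for one connection."""

    snapshot: Optional[dict] = None
    connection_id: Optional[str] = None
    session_id: Optional[str] = None
    latest: dict = field(default_factory=dict)  # last message per event name
    log: list = field(default_factory=list)     # replayed session entries

    def apply(self, raw: str) -> None:
        msg = json.loads(raw)
        tag = msg.get("t")
        if tag == "state_snapshot":
            self.snapshot = msg["state"]
            self.connection_id = msg.get("connection_id")
            self.session_id = msg.get("session_id")
        elif tag == "log_replay":
            self.log.extend(msg["entries"])
        elif "event" in msg:
            # Cached events overwrite earlier values of the same kind.
            self.latest[msg["event"]] = msg
```

Keeping only the most recent message per event name mirrors the "cached latest" behavior described above, so the same apply method can keep handling live events after the bootstrap completes.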
HTTP Endpoints
| Endpoint | Description |
|---|---|
| GET / | App dashboard (4-tab UI: Activity, Usage, Terminal, Displays) |
| GET /config | Live model configuration JSON |
| GET /debug | Debug JSON (agent state, voice connection, active browser) |
| POST /session | Mint ephemeral session tokens for Gemini Live / OpenAI Realtime |
| GET /wasm-web/* | WASM and JS glue (content-hash cache-busted) |
| GET /audio-processor.js | AudioWorklet processor for microphone capture |
| WS / | Main WebSocket (events, terminal I/O, presence protocol) |
| WS /vnc | WebSocket-to-TCP VNC proxy for noVNC display viewing |
Requirements
- Microphone access requires a secure context: Use localhost (via SSH tunnel: ssh -L 8765:localhost:8765 host), or set browser flags for insecure origins.
- API key for voice: Gemini or OpenAI. The key is used browser-side only. Voice is optional; the dashboard works without it.
Supported Tools (Browser Live Model)
| Tool | Type | Description |
|---|---|---|
| submit_task | Action | Submit a new task to the agent loop |
| approve_action | Action | Approve a pending action |
| deny_action | Action | Deny a pending action |
| skip_action | Action | Skip a pending action |
| respond_to_question | Action | Answer an askHuman question |
| set_autonomy | Action | Change autonomy level |
| check_status | Query | Get current agent phase, turn, budget |
| query_detail | Query | Get git diff, file contents, or log details |
| recall_memory | Query | Search the knowledge store by keywords/channel |