Web Dashboard
The --web flag starts a web server that serves a dashboard for monitoring and interacting with Intendant remotely. The dashboard runs entirely in the browser, with state management handled by a WASM module.
Running
# Default port 8765
./target/release/intendant --web
# Custom port
./target/release/intendant --web 9000
Open http://<host>:8765/ in a browser (substitute the custom port if you set one). The --web flag implies --mcp, so no initial task is required: the agent starts idle and accepts tasks dynamically.
Dashboard Tabs
Activity
A scrollable, color-coded event log showing everything happening in the system:
- system — session lifecycle, approvals, context management
- worker — model responses, reasoning summaries, task completion
- agent — command execution output (stdout/stderr, exit codes)
- live — voice transcripts, presence lifecycle, tool requests
- server — presence model internals (thinking, tool calls)
Events are grouped by turn with visual separators. Events that arrive while another tab is in view trigger a notification badge. Late-connecting browsers receive a full replay of historical events from session.jsonl.
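The grouping and replay behavior can be pictured with a short sketch. It assumes each line of session.jsonl is a JSON object with hypothetical `turn`, `source`, and `text` fields; the actual schema is not documented here, so treat the field names as placeholders.

```python
import json
from collections import OrderedDict

def group_events_by_turn(jsonl_text):
    """Parse JSONL text and group events by their (hypothetical) 'turn' field."""
    turns = OrderedDict()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        turns.setdefault(event.get("turn", 0), []).append(event)
    return turns

# Sample data invented for illustration; not real Intendant output.
sample = "\n".join([
    json.dumps({"turn": 1, "source": "system", "text": "session started"}),
    json.dumps({"turn": 1, "source": "worker", "text": "planning"}),
    json.dumps({"turn": 2, "source": "agent", "text": "exit code 0"}),
])

for turn, events in group_events_by_turn(sample).items():
    print(f"--- turn {turn} ---")  # stands in for the visual separator
    for ev in events:
        print(f"[{ev['source']}] {ev['text']}")
```

A late-connecting client would run the same grouping over the replayed history before appending live events.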
Usage
Token consumption for the main model and presence model:
- Prompt, completion, and cached token breakdowns
- Cost estimates using a built-in pricing table (OpenAI, Anthropic, Gemini models)
- Usage history over time
- Updated after each agent turn via usage_update events
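As a rough illustration of how a pricing table turns token counts into a cost estimate, here is a minimal sketch. The model name and per-million-token prices are invented placeholders, not values from Intendant's built-in table, and the sketch assumes cached tokens are a subset of prompt tokens billed at a lower rate.

```python
# Hypothetical prices in USD per million tokens; not Intendant's real table.
PRICING = {
    "example-model": {"prompt": 2.00, "cached": 0.50, "completion": 8.00},
}

def estimate_cost(model, prompt_tokens, cached_tokens, completion_tokens):
    """Estimate cost in USD, billing cached prompt tokens at the cached rate.

    Assumes cached_tokens is counted inside prompt_tokens.
    """
    rates = PRICING[model]
    uncached = prompt_tokens - cached_tokens
    if uncached < 0:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    total = (uncached * rates["prompt"]
             + cached_tokens * rates["cached"]
             + completion_tokens * rates["completion"])
    return total / 1_000_000

cost = estimate_cost("example-model", 10_000, 4_000, 2_000)
print(f"${cost:.4f}")
```

Providers differ in how they report cached tokens, so a real table would likely need per-provider handling.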
Terminal
An embedded xterm.js terminal connected to the server-side ratatui TUI. Each browser connection gets its own independent terminal rendering with separate dimensions. This shows the same interface as the native terminal TUI — status bar, log panel, action panel, approval/input panels.
Key presses and terminal resizes in the browser are sent to the server and rendered independently per connection.
Displays
Remote viewing of Xvfb displays created by the agent. When the agent runs graphical applications (via execAsAgent with a DISPLAY), the display appears here as a noVNC viewer.
Displays are created lazily: the tab populates automatically when the agent's first command triggers Xvfb auto-launch. Each display also shows its VNC port for direct connection.
Live Voice
The dashboard supports optional live voice interaction via Gemini Live or OpenAI Realtime. When activated:
- The browser connects directly to the model’s realtime API for low-latency voice I/O
- The live model receives agent events and narrates progress
- Tool calls from the live model (submit_task, approve_action, check_status, etc.) are routed through the WebSocket to the server
- Server-side presence is automatically paused (mutual exclusion)
Setup
- Enter your API key on first visit (Gemini or OpenAI)
- Keys are stored in browser localStorage — never sent to the Intendant server
- Click the microphone button to connect
Active/Passive Browsers
Only one browser can be “active” (controlling the voice model) at a time:
- First browser to connect voice becomes active
- Additional browsers are passive observers (receive events and TUI frames, but don’t pause server-side presence)
- A passive browser can request active status via the UI, which force-disconnects the previous active browser
- Active handover includes the last checkpoint summary and conversation context
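The active/passive rules above amount to a small state machine. The following is a minimal sketch of that logic, not the server's actual implementation; the class and method names are hypothetical, and the handover of checkpoint summary and conversation context is left out.

```python
class VoiceArbiter:
    """Tracks which browser connection controls the voice model (illustrative)."""

    def __init__(self):
        self.active = None    # id of the active browser, if any
        self.passive = set()  # ids of passive observers

    def connect_voice(self, browser_id):
        """The first browser to connect voice becomes active; later ones are passive."""
        if self.active is None:
            self.active = browser_id
            return "active"
        self.passive.add(browser_id)
        return "passive"

    def request_active(self, browser_id):
        """Promote a browser; return the id that gets force-disconnected, or None."""
        previous = self.active
        if previous == browser_id:
            return None  # already active, nothing to do
        self.passive.discard(browser_id)
        self.active = browser_id
        return previous

    def server_presence_paused(self):
        """Server-side presence is paused only while some browser is active."""
        return self.active is not None

arbiter = VoiceArbiter()
print(arbiter.connect_voice("tab-a"))
print(arbiter.connect_voice("tab-b"))
print(arbiter.request_active("tab-b"))
```

Returning the previous active id makes the force-disconnect an explicit step for the caller rather than a hidden side effect.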
Session Continuity
The presence session protocol maintains context across reconnects:
- On connect, the server sends a presence_welcome with current state, missed events, and conversation context
- The browser sends periodic presence_checkpoint messages with a summary of the conversation
- On reconnect, the server replays events since the last checkpoint
- This prevents the voice model from losing context when the browser refreshes or the connection drops
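One way to picture the checkpoint-and-replay flow is with per-event sequence numbers. This is a sketch under that assumption; the payload field names (`summary`, `missed_events`) and the class itself are hypothetical, and only the presence_welcome and presence_checkpoint message names come from the protocol described above.

```python
class PresenceSession:
    """Server-side bookkeeping for replay after a reconnect (illustrative)."""

    def __init__(self):
        self.events = []         # (sequence number, event) pairs
        self.next_seq = 0
        self.checkpoint_seq = 0  # events below this are covered by the summary
        self.summary = ""

    def record(self, event):
        """Store an agent event with the next sequence number."""
        self.events.append((self.next_seq, event))
        self.next_seq += 1

    def checkpoint(self, summary):
        """Handle a presence_checkpoint: keep the summary, mark events as seen."""
        self.summary = summary
        self.checkpoint_seq = self.next_seq

    def welcome(self):
        """Build a presence_welcome-style payload for a (re)connecting browser."""
        missed = [ev for seq, ev in self.events if seq >= self.checkpoint_seq]
        return {
            "type": "presence_welcome",
            "summary": self.summary,
            "missed_events": missed,
        }

session = PresenceSession()
session.record("task submitted")
session.checkpoint("User asked for a build; task submitted.")
session.record("build finished")  # arrives while the browser is disconnected
print(session.welcome()["missed_events"])
```

The summary stands in for everything before the checkpoint, so only later events need replaying.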
Server-Side Transcription
When [transcription] is enabled in intendant.toml, the browser sends microphone audio to the server for transcription via the Whisper API:
[transcription]
enabled = true
provider = "openai"
model = "whisper-1"
language = "en"
Audio is buffered in ~3s chunks, filtered by RMS energy to skip silence, and sent to the transcription endpoint. Transcripts are broadcast as user_transcript events and logged to the session.
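The chunk-and-filter step can be sketched as follows. The sample rate and RMS threshold are assumptions chosen for illustration (the actual values are not stated here), and samples are taken to be floats in the range -1.0 to 1.0.

```python
import math

SAMPLE_RATE = 16_000   # assumed capture rate in Hz
CHUNK_SECONDS = 3      # matches the ~3s buffering described above
RMS_THRESHOLD = 0.01   # assumed silence threshold for samples in [-1.0, 1.0]

def rms(samples):
    """Root-mean-square energy of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def voiced_chunks(samples, sample_rate=SAMPLE_RATE):
    """Split audio into fixed-length chunks and yield only the non-silent ones."""
    chunk_len = sample_rate * CHUNK_SECONDS
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]
        if rms(chunk) >= RMS_THRESHOLD:
            yield chunk  # this chunk would be sent for transcription

# One chunk of silence followed by one chunk of a 440 Hz tone.
n = SAMPLE_RATE * CHUNK_SECONDS
silence = [0.0] * n
tone = [0.5 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(n)]
kept = list(voiced_chunks(silence + tone))
print(len(kept))
```

Skipping silent chunks avoids paying for transcription requests that would return nothing useful.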
Configuration
The web gateway is configured through the [presence] section of intendant.toml:
[presence]
live_provider = "gemini" # voice model provider
live_model = "gemini-2.5-flash-native-audio-preview-12-2025" # voice model
Or via environment variables:
- GEMINI_API_KEY / OPENAI_API_KEY: used for ephemeral token minting (POST /session)
The /config endpoint returns the configured provider, model, and sample rates as JSON.
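To inspect what the server reports, the JSON endpoints can be queried from a script. This is a minimal sketch using only the Python standard library; the helper names are made up for illustration, and since the response field names are not listed here, the example just pretty-prints whatever JSON comes back.

```python
import json
from urllib.request import urlopen

def endpoint_url(host, path, port=8765):
    """Build a dashboard URL; 8765 is the default port of --web."""
    return f"http://{host}:{port}{path}"

def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# Example (requires a running `intendant --web` instance):
#   config = fetch_json(endpoint_url("localhost", "/config"))
#   print(json.dumps(config, indent=2))
```

The same helper works for GET /debug by changing the path.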
HTTP Endpoints
| Endpoint | Description |
|---|---|
| GET / | Web app dashboard (4-tab UI) |
| GET /config | Live model configuration JSON |
| GET /debug | Debug JSON (agent state, voice connection, active browser) |
| POST /session | Mint ephemeral session tokens for Gemini Live / OpenAI Realtime |
| GET /wasm-web/* | WASM and JS glue (content-hash cache-busted) |
| GET /audio-processor.js | AudioWorklet processor for microphone capture |
| WS / | Main WebSocket (events, terminal I/O, presence protocol) |
| WS /vnc | WebSocket-to-TCP VNC proxy (for noVNC) |
Requirements
- Microphone access requires a secure context: Use localhost (via SSH tunnel: ssh -L 8765:localhost:8765 host), or set browser flags for insecure origins
- API key for voice: Gemini or OpenAI (stored browser-side only). Voice is optional; the dashboard works without it
- WASM: The dashboard uses a compiled WASM module (presence-web crate). Rebuild with wasm-pack build --target web from crates/presence-web/ if you modify the Rust code
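The secure-context requirement explains why the SSH tunnel helps: browsers generally treat https origins and plain-http loopback origins as secure, but not plain http to a remote host. The check below is a simplified approximation of that rule for illustration only; real browsers apply more detailed logic, and the function name is hypothetical.

```python
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}

def allows_microphone(url):
    """Approximate whether a dashboard URL counts as a secure context."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True
    host = parts.hostname or ""
    is_loopback = host in LOOPBACK_HOSTS or host.endswith(".localhost")
    return parts.scheme == "http" and is_loopback

for url in ("http://localhost:8765/", "http://192.168.1.20:8765/"):
    print(url, allows_microphone(url))
```

With the tunnel in place, the browser loads the dashboard from localhost, so microphone capture works even though the server runs on a remote host.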