arbiter --api
Run arbiter as an HTTP+SSE orchestration server. Long-running, multi-tenant, bearer-token authenticated. The on-the-wire protocol is documented in docs/api/; this page covers the CLI flags and operational behaviour.
arbiter --api [--port N] [--bind ADDR] [--verbose]
| Flag | Default | What it does |
|---|---|---|
| --port | 8080 | TCP port to listen on. |
| --bind | 127.0.0.1 | Bind address. Use 0.0.0.0 to accept remote traffic. |
| --verbose | off | Mirror every SSE event (text deltas, tool calls, thinking) to stderr. Equivalent to ARBITER_API_VERBOSE=1. The flag takes precedence over the env var. |
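
For supervisors or test harnesses that launch the server programmatically, the flags above can be assembled into an argument list. This is a minimal sketch: `build_api_argv` is a hypothetical helper, not part of arbiter, and the 1 to 65535 port check is an assumption about what the binary accepts.

```python
def build_api_argv(port=8080, bind="127.0.0.1", verbose=False):
    """Build an argv list for `arbiter --api` using only the documented flags.

    Hypothetical helper; defaults mirror the flag table above.
    """
    # Assumption: the server expects an explicit port in the usual TCP range.
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    argv = ["arbiter", "--api", "--port", str(port), "--bind", bind]
    if verbose:
        argv.append("--verbose")
    return argv
```

The result can be handed to `subprocess.Popen(build_api_argv(bind="0.0.0.0"))` or pasted into a service unit.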
What it does on startup
- Opens ~/.arbiter/tenants.db. Warns if no tenants exist (server still starts; every /v1/orchestrate returns 401 until you run --add-tenant).
- Reads provider API keys from env or ~/.arbiter/. See environment.md.
- Resolves the admin token used to authorise tenant-management endpoints (/v1/admin/*). See ~/.arbiter/admin_token, which is created on first --api launch if missing.
- Loads agent constitutions from ~/.arbiter/agents/*.json.
- Loads the optional MCP registry from ~/.arbiter/mcp_servers.json. See docs/concepts/mcp.md.
- Loads the optional A2A remote-agent registry from ~/.arbiter/a2a_agents.json. See a2a-agents.md.
- Reads optional web-search provider config from ARBITER_SEARCH_PROVIDER / ARBITER_SEARCH_API_KEY (or BRAVE_SEARCH_API_KEY as a convenience fallback).
- Reads optional billing-service URL from ARBITER_BILLING_URL. Unset = no eligibility checks, no usage record posting; provider keys go through directly with no caps.
- Reads optional sandbox config from ARBITER_SANDBOX_IMAGE (and the rest of the ARBITER_SANDBOX_* set). When configured and usable, a per-tenant Docker sandbox is built so /exec runs inside a /workspace volume shared with /write and /read. See docs/concepts/sandbox.md. When the sandbox isn't wired or fails its usability check, /exec stays disabled (the historic SaaS default).
- Clears the terminal (ANSI \033[2J\033[3J\033[H) and prints the banner + endpoint summary at row 1, so the bind address and admin-token lines anchor at the top of a clean screen instead of chasing prior shell history.
- Binds the listen socket and starts accepting requests.
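
Before launching, it can help to see which of the files and variables listed above are actually present. The sketch below only inspects the filesystem and environment; `preflight` and its result keys are hypothetical, and it does not reproduce arbiter's own startup checks. A present tenants.db does not imply any tenants exist.

```python
import os
from pathlib import Path

# Optional env vars named in the startup list above.
OPTIONAL_ENV = ("ARBITER_SEARCH_PROVIDER", "ARBITER_BILLING_URL", "ARBITER_SANDBOX_IMAGE")

def preflight(home=None, environ=None):
    """Report which startup inputs exist under <home>/.arbiter (hypothetical helper)."""
    root = Path(home if home is not None else Path.home()) / ".arbiter"
    env = os.environ if environ is None else environ
    agents = root / "agents"
    return {
        "tenants_db": (root / "tenants.db").is_file(),
        "admin_token": (root / "admin_token").is_file(),
        # Count of agent constitution files matching agents/*.json.
        "agent_files": len(list(agents.glob("*.json"))) if agents.is_dir() else 0,
        "mcp_registry": (root / "mcp_servers.json").is_file(),
        "a2a_registry": (root / "a2a_agents.json").is_file(),
        # Names of optional variables that are set to a non-empty value.
        "env_set": sorted(name for name in OPTIONAL_ENV if env.get(name)),
    }
```

Calling `preflight()` with no arguments inspects the current user's home directory and process environment.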
SIGINT / SIGTERM triggers graceful shutdown: the listen socket closes, every in-flight orchestration receives a cancel signal, and the server waits up to ARBITER_DRAIN_SECONDS (default 30) for connection threads to finish before tearing down sandbox containers and exiting. Connections still active past the deadline are abandoned with a stderr warning. See Operations → Graceful shutdown.
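
The drain behaviour described above (signal cancellation, wait up to a shared deadline, abandon whatever is left) can be illustrated with plain Python threads. This is a sketch of the pattern only, not arbiter's implementation; `drain` and its parameters are hypothetical names.

```python
import threading
import time

def drain(threads, cancel, drain_seconds=30.0):
    """Signal cancellation, then wait for threads until one shared deadline.

    Returns the names of threads still alive past the deadline, which a
    server would abandon with a warning.
    """
    cancel.set()  # every in-flight worker is expected to watch this event
    deadline = time.monotonic() + drain_seconds
    abandoned = []
    for t in threads:
        # The deadline is shared, so each join only gets the time remaining.
        remaining = deadline - time.monotonic()
        t.join(timeout=max(0.0, remaining))
        if t.is_alive():
            abandoned.append(t.name)
    return abandoned
```

A single deadline, rather than a per-thread timeout, keeps total shutdown time bounded by the drain window regardless of how many connections are open.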
What it doesn't do
- TLS termination. The server speaks plain HTTP. Production deployments should put TLS termination, DDoS protection, and rate limiting in a reverse proxy (nginx, caddy, cloudflare) in front of the process.
- Sandboxing. By default the API server disables /exec (no shell execution from agents) and intercepts /write so file output streams to the client without landing on the server's disk. To enable /exec safely, configure the per-tenant Docker sandbox; see docs/concepts/sandbox.md. Even with the sandbox on, the arbiter daemon itself is not isolated; run the process in a container or a dedicated user account in production.
- Process supervision. The binary doesn't daemonise itself. Run it under systemd / launchd / docker / pm2, whichever your platform uses.
Endpoint reference
The HTTP surface is documented in detail in docs/api/. Quick orientation:
- POST /v1/orchestrate drives the full agentic loop and streams everything back as SSE.
- GET /v1/agents, POST /v1/agents, etc. manage the tenant-stored agent catalogue.
- POST /v1/conversations, /messages manage long-running conversations with persisted history.
- /v1/memory/* exposes the structured memory graph (typed nodes, relations, FTS search).
- /v1/artifacts/* serves files agents wrote.
- /v1/a2a/agents/:id exposes the Agent2Agent (A2A) protocol surface: per-agent cards plus JSON-RPC dispatch (message/send, message/stream, tasks/get, tasks/cancel). See docs/concepts/a2a.md.
- /v1/admin/tenants/* handles tenant lifecycle (admin-token gated).
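
Because /v1/orchestrate streams its results as SSE, a client needs to split the response into events. The sketch below follows the standard text/event-stream framing (field lines, blank-line dispatch, comment lines starting with a colon). The event names and payloads arbiter emits are documented in docs/api/ and are not assumed here; `parse_sse` is a hypothetical helper.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of text/event-stream lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if any data arrived.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is part of the framing
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

The function accepts any iterable of decoded lines, so it can be fed from a file, a test fixture, or an HTTP response body read line by line.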
Tenant identity
Every authenticated request carries Authorization: Bearer <token> where the token was issued by arbiter --add-tenant <name>. Tokens are stored hashed (SHA-256) in ~/.arbiter/tenants.db; the plaintext is shown exactly once at provisioning time. See tenants.md.
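
Storing only a SHA-256 digest means the server compares hashes, never plaintext. A minimal sketch of that idea follows, assuming an unsalted hex-encoded digest; the actual column encoding in tenants.db is not specified on this page, and all function names are hypothetical.

```python
import hashlib
import hmac

def hash_token(token):
    """SHA-256 of the token, hex-encoded (encoding is an assumption)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()

def token_matches(presented, stored_hash):
    """Compare in constant time so timing does not leak how many characters matched."""
    return hmac.compare_digest(hash_token(presented), stored_hash)

def auth_header(token):
    """Header a client sends on every authenticated request."""
    return {"Authorization": f"Bearer {token}"}
```

Since the plaintext is shown only once at provisioning time, a lost token cannot be recovered from the database; it has to be reissued.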
Billing
Optional. When ARBITER_BILLING_URL is set, the runtime exchanges every bearer for a workspace_id via POST /v1/runtime/auth/validate, pre-flights against POST /v1/runtime/quota/check, and posts post-turn telemetry to POST /v1/runtime/usage/record. With the env var unset, the runtime acts as a thin pass-through using the operator-supplied provider keys, with no eligibility checks. Operators wanting a commercial deployment must implement the runtime's billing protocol against a service of their own choosing; arbiter ships no reference implementation in this repository.
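
The order of the three billing calls can be sketched as follows. Only the endpoint paths and the workspace_id field come from this page; the request bodies, the `allowed` response field, and the `post` / `run_orchestration` callables are hypothetical placeholders, so treat this as an illustration of sequencing rather than the wire protocol.

```python
def run_turn(bearer, post, run_orchestration):
    """Validate, pre-flight, run, then record usage.

    `post(path, body)` is a caller-supplied function returning a dict.
    All body and response field names other than workspace_id are assumptions.
    """
    auth = post("/v1/runtime/auth/validate", {"token": bearer})
    workspace_id = auth["workspace_id"]

    quota = post("/v1/runtime/quota/check", {"workspace_id": workspace_id})
    if not quota.get("allowed", False):
        return None  # pre-flight failed: skip the turn and record nothing

    usage = run_orchestration()
    post("/v1/runtime/usage/record", {"workspace_id": workspace_id, "usage": usage})
    return usage
```

Injecting `post` keeps the sketch independent of any HTTP library and makes the sequence easy to exercise against a stub billing service.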
Logging
By default the server logs only structured events (request received, request completed, errors). Add --verbose (or ARBITER_API_VERBOSE=1) to mirror SSE events to stderr — useful when debugging an agent's tool use, noisy under normal load. Verbose mode renders one stderr line per event, colour-coded:
- text deltas (line-buffered, flushed on newline)
- tool_call: slash-command execution with ✓/✗ status
- advisor: runtime advisor activity, rendered as advise (executor /advise consult), gate ✓ (continue), gate ↻ (redirect with guidance), gate ✗ (halt with reason), gate ⛔ (redirect budget exhausted). See docs/concepts/sse-events.md and docs/concepts/advisor.md.
- escalation: out-of-band advisor halt notification (sibling of stream_end)
- file: /write writes streamed to the client
Logs go to stderr; redirect with 2> in your service unit if you want them captured separately from stdout.
Resource expectations
- Each in-flight /v1/orchestrate request runs its own Orchestrator instance with a per-request MCP manager. No global mutex on the request path; per-request state is isolated.
- SQLite is the storage layer (~/.arbiter/tenants.db). Single-writer; safe for one process. If you need to scale to multiple processes, you'll need a different storage strategy: the schema is portable but the connection layer is currently tied to local SQLite.
- Memory: scales with concurrent in-flight requests + number of distinct tenants in active use. A few hundred MB is typical for low-traffic deployments; tail latencies are dominated by upstream provider response time, not by the server itself.
Health check
curl http://127.0.0.1:8080/v1/health
Returns {"ok": true} once the server has finished startup. Useful as a Kubernetes liveness probe.
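
In deploy scripts it is often convenient to block until the health endpoint reports ready. The following is a sketch using only the standard library; the retry count, delay, and helper names are illustrative choices, not arbiter defaults.

```python
import json
import time
import urllib.error
import urllib.request

def is_healthy(body):
    """True only when the body is a JSON object with "ok" set to true."""
    try:
        return json.loads(body).get("ok") is True
    except (ValueError, AttributeError):
        return False  # not JSON, or JSON that is not an object

def fetch_health(base_url, timeout=2.0):
    """GET <base_url>/v1/health and return the response body as text."""
    with urllib.request.urlopen(f"{base_url}/v1/health", timeout=timeout) as resp:
        return resp.read().decode("utf-8")

def wait_until_healthy(base_url="http://127.0.0.1:8080", attempts=30, delay=1.0,
                       fetch=fetch_health):
    """Poll the health endpoint; return True once it reports ok, else False."""
    for _ in range(attempts):
        try:
            if is_healthy(fetch(base_url)):
                return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet
        time.sleep(delay)
    return False
```

The `fetch` parameter exists so the polling loop can be tested without a running server.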
See also
- docs/api/index.md: endpoint reference.
- docs/concepts/authentication.md: bearer-token semantics.
- docs/concepts/operations.md: operational notes.
- tenants.md: provisioning the bearer tokens this server validates.