Cli

arbiter --api

Run arbiter as an HTTP+SSE orchestration server. Long-running, multi-tenant, bearer-token authenticated. The on-the-wire protocol is documented in docs/api/; this page covers the CLI flags and operational behaviour.

arbiter --api [--port N] [--bind ADDR] [--verbose]

Flag	Default	What it does
`--port`	`8080`	TCP port to listen on.
`--bind`	`127.0.0.1`	Bind address. `0.0.0.0` to accept remote traffic.
`--verbose`	off	Mirror every SSE event (text deltas, tool calls, thinking) to stderr. Equivalent to `ARBITER_API_VERBOSE=1`. Verbose flag wins over the env var.

What it does on startup

Opens ~/.arbiter/tenants.db. Warns if no tenants exist (server still starts; every /v1/orchestrate returns 401 until you --add-tenant).
Reads provider API keys from env or ~/.arbiter/. See environment.md.
Resolves the admin token used to authorise tenant-management endpoints (/v1/admin/*). See ~/.arbiter/admin_token — created on first --api launch if missing.
Loads agent constitutions from ~/.arbiter/agents/*.json.
Loads the optional MCP registry from ~/.arbiter/mcp_servers.json. See docs/concepts/mcp.md.
Loads the optional A2A remote-agent registry from ~/.arbiter/a2a_agents.json. See a2a-agents.md.
Reads optional web-search provider config from ARBITER_SEARCH_PROVIDER / ARBITER_SEARCH_API_KEY (or BRAVE_SEARCH_API_KEY as a convenience fallback).
Reads optional billing-service URL from ARBITER_BILLING_URL. Unset = no eligibility checks, no usage record posting; provider keys go through directly with no caps.
Reads optional sandbox config from ARBITER_SANDBOX_IMAGE (and the rest of the ARBITER_SANDBOX_* set). When configured and usable, a per-tenant Docker sandbox is built so /exec runs inside a /workspace volume shared with /write and /read. See docs/concepts/sandbox.md. When the sandbox isn't wired or fails its usability check, /exec stays disabled — the historic SaaS default.
Clears the terminal (ANSI \033[2J\033[3J\033[H) and prints the banner + endpoint summary at row 1, so the bind address and admin-token lines anchor at the top of a clean screen instead of chasing prior shell history.
Binds the listen socket and starts accepting requests.

SIGINT / SIGTERM triggers graceful shutdown: the listen socket closes, every in-flight orchestration receives a cancel signal, and the server waits up to ARBITER_DRAIN_SECONDS (default 30) for connection threads to finish before tearing down sandbox containers and exiting. Connections still active past the deadline are abandoned with a stderr warning. See Operations → Graceful shutdown.

What it doesn't do

TLS termination. The server speaks plain HTTP. Production deployments should put TLS termination, DDoS protection, and rate limiting in a reverse proxy (nginx, caddy, cloudflare) in front of the process.
Sandboxing. By default the API server disables /exec (no shell execution from agents) and intercepts /write so file output streams to the client without landing on the server's disk. To enable /exec safely, configure the per-tenant Docker sandbox — see docs/concepts/sandbox.md. Even with the sandbox on, the arbiter daemon itself is not isolated; run the process in a container or a dedicated user account in production.
Process supervision. The binary doesn't daemonise itself. Run it under systemd / launchd / docker / pm2 — whichever your platform uses.

Endpoint reference

The HTTP surface is documented in detail in docs/api/. Quick orientation:

POST/v1/orchestrate — drives the full agentic loop, streams everything back as SSE.
GET/v1/agents, POST/v1/agents, etc. — manage the tenant-stored agent catalogue.
POST/v1/conversations, /messages — long-running conversations with persisted history.
/v1/memory/* — structured memory graph (typed nodes, relations, FTS search).
/v1/artifacts/* — files agents wrote.
/v1/a2a/agents/:id — Agent2Agent (A2A) protocol surface: per-agent cards plus JSON-RPC dispatch (message/send, message/stream, tasks/get, tasks/cancel). See docs/concepts/a2a.md.
/v1/admin/tenants/* — tenant lifecycle (admin-token gated).

Tenant identity

Every authenticated request carries Authorization: Bearer <token> where the token was issued by arbiter --add-tenant <name>. Tokens are stored hashed (SHA-256) in ~/.arbiter/tenants.db; the plaintext is shown exactly once at provisioning time. See tenants.md.

Billing

Optional. When ARBITER_BILLING_URL is set, the runtime exchanges every bearer for a workspace_id via POST/v1/runtime/auth/validate, pre-flights against POST/v1/runtime/quota/check, and posts post-turn telemetry to POST/v1/runtime/usage/record. With the env var unset, the runtime acts as a thin pass-through using the operator-supplied provider keys, with no eligibility checks. Operators wanting a commercial deployment must implement the runtime's billing protocol against a service of their own choosing — arbiter ships no reference implementation in this repository.

Logging

By default the server logs only structured events (request received, request completed, errors). Add --verbose (or ARBITER_API_VERBOSE=1) to mirror SSE events to stderr — useful when debugging an agent's tool use, noisy under normal load. Verbose mode renders one stderr line per event, colour-coded:

text deltas (line-buffered, flushed on newline)
tool_call — slash-command execution with ✓/✗ status
advisor — runtime advisor activity: advise (executor /advise consult), gate ✓ (continue), gate ↻ (redirect with guidance), gate ✗ (halt with reason), gate ⛔ (redirect budget exhausted). See docs/concepts/sse-events.md and docs/concepts/advisor.md.
escalation — out-of-band advisor halt notification (sibling of stream_end)
file — /write writes streamed to the client

Logs go to stderr; redirect with 2> in your service unit if you want them captured separately from stdout.

Resource expectations

Each in-flight /v1/orchestrate request runs its own Orchestrator instance with a per-request MCP manager. No global mutex on the request path; per-request state is isolated.
SQLite is the storage layer (~/.arbiter/tenants.db). Single-writer; safe for one process. If you need to scale to multiple processes, you'll need a different storage strategy — the schema is portable but the connection layer is currently tied to local SQLite.
Memory: scales with concurrent in-flight requests + number of distinct tenants in active use. A few hundred MB is typical for low-traffic deployments; tail latencies are dominated by upstream provider response time, not by the server itself.

Health check

curl http://127.0.0.1:8080/v1/health

Returns {"ok": true} once the server has finished startup. Useful as a Kubernetes liveness probe.