All posts
2 min read

Why we built a runtime for supervised agents

  • philosophy
  • runtime
  • supervision

An agent that can call tools is easy to build. An agent you can trust to call tools while you're not watching is not. The gap between those two things is where most "agent" projects quietly stall — not because the model isn't capable, but because nothing in the system is accountable for what the model decides to do.

The missing layer

The common shape of an agent framework is: a model, a tool schema, a loop. The model proposes a tool call, the loop executes it, the result goes back in. Repeat until the model says it's done.

Nothing in that loop is positioned to say no. The model is both the actor and the only judge of its own actions. When it goes off the rails — loops on a failing call, fabricates a result, declares success on a half-finished task — there's no structural check, only the hope that the next token is better than the last.

Arbiter adds that missing layer: an advisor model, structurally separate from the executor, that the executor cannot bypass. Before a terminating turn ships, the advisor sees it and returns one of three verdicts:

  • CONTINUE — the turn is sound, let it through.
  • REDIRECT — the turn is off-course; here's a correction, try again.
  • HALT — stop; this isn't recoverable without a human.

The executor doesn't get to ignore this. The gate is enforced by the runtime, not by a prompt instruction the model might wander away from.

Supervision is a runtime concern

You could try to bolt this on at the application layer — wrap every agent call, diff the output, run a second model, decide what to do. People do. It's brittle, it's duplicated per integration, and it's only as good as the wrapper.

Putting it in the runtime means it's there for every surface equally: the HTTP API, the interactive terminal, a one-shot shell invocation. Same agents, same memory, same gate. The supervision isn't a feature of one client — it's a property of the system.

That's the bet behind Arbiter: the interesting problems in agentic software aren't in the model call itself. They're in everything around it — memory that persists and ranks, execution that streams and resumes, and a supervisor that's allowed to say no.