Here's how most people think about AI safety: give the agent permissions. Put it in a sandbox. Define what it can and can't access. If something goes wrong, tighten the permissions.
This is the IT security model. It asks one question: is this agent authorized to do this thing?
It's a reasonable starting point. It's also completely insufficient for autonomous systems — and if you've ever worked in accounting, you already know why.
The question accounting asks instead
Accounting is built on a different assumption. Not "is this person authorized?" but "even if authorized, should they act alone?"
This is separation of duties. It's not a best practice or a nice-to-have. It's a structural requirement that dates back to double-entry bookkeeping in 15th-century Florence. The person who enters a transaction cannot be the same person who approves it. The person who approves it cannot be the same person who pays it. The person who reconciles the bank cannot be the same person who wrote the checks.
This isn't because accountants are paranoid. It's because accounting has always operated under a specific assumption that IT security doesn't make: internal users are potential bad actors.
Think about that for a second. IT security asks whether you should be inside the building. Accounting assumes you're already inside and asks whether you should be acting alone once you're there. These are fundamentally different threat models.
Why this matters for AI agents
Most AI safety work is built on the IT model. Can this agent access this API? Is it sandboxed? What are its permissions? These are real questions, and they matter. But they only address one failure mode: an agent doing something it wasn't supposed to do at all.
They don't address the harder failure mode: an agent doing something it was supposed to do, but doing it wrong.
An agent that has permission to post journal entries can still post the wrong journal entry. An agent authorized to process invoices can still code an expense to the wrong account. An agent with access to your bank feed can still misclassify a transaction. None of these are permission failures. They're judgment failures. And permissions don't catch judgment failures.
Separation of duties does.
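The distinction is easy to see in code. Here's a minimal Python sketch, with hypothetical names throughout (`permission_check`, `independent_review`, the GL account numbers): the permission check passes, because the agent is fully authorized, while the entry it produced is still miscoded. Only a separate check of the work itself catches that.

```python
from dataclasses import dataclass

# Hypothetical journal entry. The agent is *authorized* to post it,
# but the expense account it chose may still be wrong.
@dataclass
class JournalEntry:
    vendor: str
    amount: float
    account: str  # GL account chosen by the agent

AGENT_PERMISSIONS = {"posting-agent": {"post_journal_entry"}}

def permission_check(agent: str, action: str) -> bool:
    # The IT-security question: is this agent authorized at all?
    return action in AGENT_PERMISSIONS.get(agent, set())

def independent_review(entry: JournalEntry, expected_account: str) -> bool:
    # The accounting question: is the work itself correct?
    # No permission check can answer this.
    return entry.account == expected_account

entry = JournalEntry(vendor="Acme", amount=1200.0, account="6100-Travel")

assert permission_check("posting-agent", "post_journal_entry")        # authorized
assert not independent_review(entry, expected_account="6200-Software")  # still wrong
```

Both assertions hold at once: the action was permitted, and the output was a judgment failure.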
How to apply it
The mechanism is context isolation. Not just as a hygiene practice — as a structural requirement for adversarial review.
When two agents share the same context, they share the same blind spots. If Agent A processed an invoice and built up a context window full of that vendor's history, coding patterns, and prior transactions, then asking Agent A to review its own work is theater. It will confirm its own reasoning because it's reasoning from the same information.
But if you spin up Agent B with different context — maybe just the GL account rules, the invoice image, and the posting that Agent A produced — now you have a genuine second opinion. Agent B isn't smarter. It's differently informed. It will catch things Agent A can't see precisely because it doesn't share Agent A's context.
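The mechanism can be made concrete. In this sketch (all strings and names are illustrative, not a real API), the reviewer's context is constructed from scratch rather than inherited, so none of Agent A's accumulated assumptions carry over:

```python
# A minimal sketch of context isolation. The point is what each agent
# is allowed to see, not how the model is called.

# Everything Agent A accumulated while doing the work: vendor history,
# prior codings, its own intermediate reasoning.
executor_context = [
    "vendor history: Acme, 14 prior invoices",
    "prior codings: mostly 6200-Software",
    "invoice #1041 extracted text",
    "working notes: looks like the usual subscription renewal",
]

def build_reviewer_context(gl_rules: str, invoice_text: str, posting: str) -> list[str]:
    # Agent B's context is built fresh: the rules, the source document,
    # and the output under review. Nothing else leaks in.
    return [gl_rules, invoice_text, posting]

reviewer_context = build_reviewer_context(
    gl_rules="GL policy: SaaS subscriptions -> 6200; travel -> 6100",
    invoice_text="invoice #1041 extracted text",
    posting="Dr 6100-Travel 1,200.00 / Cr 2000-AP 1,200.00",
)

# The structural guarantee: the reviewer never inherits the executor's
# working notes, so it can't confirm the executor's reasoning by default.
assert "working notes: looks like the usual subscription renewal" not in reviewer_context
```

The design choice is that isolation lives in the construction function, not in a policy the agents are asked to follow. Agent B cannot see what it was never given.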
This is the same principle that makes an auditor effective. The auditor doesn't know your business better than you do. They catch things because they're looking at your work without your assumptions baked in.
The coordination problem
People have tried multi-agent review before. The usual approach is horizontal: two peer agents checking each other's work. This fails in predictable ways.
If both agents have equal authority, you get one of two outcomes. Either one agent declares "good enough" before the other agrees — premature exit. Or neither has the authority to stop, so they iterate forever — infinite loops. I've seen both. They're not edge cases. They're the default behavior of horizontal peer review between autonomous agents.
The fix is to turn it vertical. One agent is the executor — it has tools, it does the work. The other is the coordinator — it has no tools, it can only read and delegate. The coordinator originated the task, so it holds the ground truth of intent. The executor can't declare itself done. Only the coordinator can.
This isn't a novel hierarchy. It's how every well-run accounting department already works. Staff does the entries. Manager reviews and approves. The person doing the work never gets to sign off on their own output.
The coordinator — an agent that can only read and trigger other agents — is the structural replacement for a human reviewer. It evaluates. It doesn't execute. It owns "done."
What the industry is getting wrong
Right now, the AI industry is approaching safety through the IT lens. Permissions. Sandboxes. Guardrails-as-constraints. These matter, and I use all of them. But they're table stakes, not the solution.
The accounting lens offers something different: structural impossibility of unilateral action. Not "this agent isn't allowed to do that" but "no single agent can complete this process alone, regardless of what it's allowed to do."
The accounting profession figured this out five centuries ago. Not because accountants are smarter than security engineers — because they started from a harder assumption. They assumed the people inside the system couldn't be fully trusted, and they built structures that made trust unnecessary.
That's the model AI agents need. Not better permissions. Better structure.
This is the first in a series on how accounting principles apply to AI agent architecture. Next: why we chose not to build our own AI agent — and what we built instead.