The RAG instinct
When your agent gets something wrong, the first impulse is to fix it with more knowledge. Build a bigger vector store. Add more few-shot examples. Fine-tune on the edge cases. Write a longer system prompt that covers every scenario you've encountered so far. Pre-compute the answer and shove it into the context window so the model can't possibly miss it.
This is what I'd call the RAG instinct. Something went wrong, so inject more information. It's intuitive, it's fast, and it works — right up until it doesn't.
When it stops working
The problem is novelty. Your knowledge base covers the vendors you've seen. Your few-shot examples cover the transaction types you've encountered. Your carefully tuned prompt handles every edge case that's already bitten you. Then a new vendor shows up. A transaction comes through that doesn't match any pattern. An edge case falls between two of your pre-computed rules.
The agent doesn't know it's in unfamiliar territory. It was never taught to recognize that it's in unfamiliar territory. It was taught that the answer is always somewhere in the context, because that's how you built it — by making sure the answer was always in the context. So it does what any model will do when pressured to produce an answer from insufficient information: it makes one up.
This is the failure mode that kills you. Not the agent getting a known case wrong — you catch those in testing. It's the agent confidently handling an unknown case as though it were known. By the time you find it, the journal entry is posted, the period is closed, and you're doing correcting entries.
The Bitter Lesson
Richard Sutton wrote an essay called "The Bitter Lesson" that makes a point AI practitioners keep having to relearn: across the history of AI, general methods that leverage computation consistently beat approaches built on hand-coded human knowledge. The systems that scale are the ones that search and learn, not the ones where researchers tried to bake in the right answers upfront.
The same principle applies to agent design. Every hour you spend encoding specific answers into your agent's context is an hour you could have spent giving it better tools to find answers on its own. The knowledge base approach is hand-coding. The tool-based approach is the general method.
Don't encode answers. Encode the ability to search for answers.
Give the agent tools to look up historical precedent. Let it query reference data. Let it search for similar past decisions and see how they were resolved. Instead of a static mapping from input to output, give it the means to construct the mapping at runtime from actual evidence.
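A minimal sketch of what such tools might look like, in Python. The data shape (a list of past decisions) and the word-overlap similarity are invented stand-ins for whatever your system actually has — a database query, an embedding search. The point is the interface: the agent asks for evidence at runtime instead of reading a pre-baked answer.

```python
def lookup_precedent(history, vendor):
    """Return past decisions for this vendor, newest first."""
    return sorted(
        (d for d in history if d["vendor"] == vendor),
        key=lambda d: d["date"],
        reverse=True,
    )

def find_similar(history, description, min_overlap=2):
    """Naive similarity: past decisions whose descriptions share words
    with this one. A real system would use embeddings; the shape of the
    tool is what matters here."""
    words = set(description.lower().split())
    return [
        d for d in history
        if len(words & set(d["description"].lower().split())) >= min_overlap
    ]

history = [
    {"vendor": "Acme", "date": "2024-03-01",
     "description": "standing desks", "account": "6100"},
    {"vendor": "Acme", "date": "2024-01-05",
     "description": "office chairs", "account": "6100"},
]
```

Nothing in these functions encodes an answer. They encode a method for retrieving evidence, which is exactly the distinction the Bitter Lesson points at.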
The management parallel
If you've managed people, you already understand this. There are two kinds of managers.
The first kind gives their team the answer every time. Someone comes to them with a question, they give the answer. It's fast. It feels productive. The manager looks indispensable. But the team never develops judgment. Every new situation requires another trip to the manager's desk. The team's capacity is permanently bottlenecked by one person's availability.
The second kind teaches their team how to find the answer. Where to look. What questions to ask. How to evaluate options. What "good" looks like. This is slower on day one. It's frustrating for a week. But by day thirty, the team is handling situations the manager never anticipated, because they were taught the method, not the output.
The first manager built a lookup table. The second manager built a search engine. The lookup table breaks on novel input. The search engine handles it.
We're making the same choice every time we design an agent. Are we giving it answers, or are we giving it the ability to find answers?
What this looks like in practice
Here's a concrete example. You have an agent that codes invoices to GL accounts. The straightforward approach is a mapping table: vendor X goes to account Y. You've seen a hundred vendors, you know how each one codes, you hard-wire it.
Works great until vendor 101 shows up. Or until vendor X starts sending invoices for a new product category that should go to a different account. The mapping table has no concept of "this doesn't fit." It just picks the closest match and moves on.
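The mapping-table approach, sketched in Python with invented vendor names. Watch what the fallback does: an unfamiliar vendor never raises a flag, it just gets fuzzy-matched to whatever is closest and coded anyway.

```python
import difflib

# Hard-wired vendor-to-account mapping (illustrative names and accounts).
GL_MAP = {"Acme Office Supply": "6100", "Acme Freight": "5400"}

def code_by_table(vendor):
    """Code an invoice from the mapping table."""
    if vendor in GL_MAP:
        return GL_MAP[vendor]
    # No concept of "this doesn't fit": take the closest known vendor
    # and move on. cutoff=0.0 means something always matches.
    closest = difflib.get_close_matches(vendor, GL_MAP, n=1, cutoff=0.0)
    return GL_MAP[closest[0]]

code_by_table("Acme Logistics")  # confidently wrong, with no escalation path
```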
Now give the agent a different tool: the ability to search for "how was this vendor coded last time?" and "what account do similar expenses typically go to?" and "has this vendor's coding ever changed?" The agent that can search handles novelty because it can distinguish between confidence and uncertainty. When the search returns strong matches, it codes with confidence. When the search returns nothing, it knows it's in new territory and can escalate.
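The same decision made from evidence instead, sketched in Python. The 0.8 threshold and the data shape are invented for illustration; what matters is that "no precedent" and "conflicting precedent" are explicit outcomes rather than silent guesses.

```python
from collections import Counter

def code_invoice(history, vendor):
    """Code an invoice from precedent, or escalate when evidence is weak."""
    matches = [d for d in history if d["vendor"] == vendor]
    if not matches:
        # New territory: the search returned nothing, so say so.
        return {"status": "escalate", "reason": "no precedent for vendor"}
    accounts = Counter(d["account"] for d in matches)
    account, count = accounts.most_common(1)[0]
    confidence = count / len(matches)
    if confidence < 0.8:
        # The vendor's coding has changed or is inconsistent.
        return {"status": "escalate", "reason": "conflicting precedent"}
    return {"status": "coded", "account": account, "confidence": confidence}

history = [
    {"vendor": "Acme", "account": "6100"},
    {"vendor": "Acme", "account": "6100"},
    {"vendor": "Acme", "account": "7200"},  # coding changed once
]
code_invoice(history, "Acme")   # escalates: only 2 of 3 precedents agree
code_invoice(history, "NewCo")  # escalates: no precedent at all
```

Note that both failure paths surface as escalations, not guesses — which is the whole argument: the search-based agent can tell confidence from uncertainty because the evidence is inspected at runtime.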
The mapping table is faster on the first invoice. The search tool is faster on the thousandth, because it hasn't broken fifty times in between.
An agent with tools to find answers will outperform an agent with pre-loaded answers — not because it's smarter, but because it knows when it doesn't know.
Self-sufficiency is the goal
The measure of a good agent is the same as the measure of a good employee: how many situations can it handle without escalating? You don't get there by anticipating every possible situation and pre-loading the response. That's not scalable, and it's not how competence works.
You get there by teaching methods. How to look things up. How to evaluate whether a match is good enough. How to recognize when you're out of your depth. How to ask for help in a way that's useful rather than just punting the problem back.
Stuff the context window with answers and you get an agent that's brittle. Stuff it with methods for finding answers and you get an agent that's resilient. The difference is the same difference between memorization and understanding, between a lookup table and a reasoning engine, between a team that depends on you and a team that can operate without you.
Teach how to think. Not what to think.
This is the third in a series on how practical experience applies to AI agent architecture. Previously: what 500 years of accounting can teach AI safety, and why you shouldn't build the agent.