I'm a CPA who builds with AI agents every day. Not writing about them, not advising on them — building and operating production systems that do real accounting work for real clients. And I've noticed something that should make anyone following AI advice uncomfortable: the playbook I was running six months ago is wrong now. Not outdated. Wrong. The patterns I was following were actively making things worse.

This isn't a normal rate of change. In accounting, a standard that took effect in 2020 is still the standard in 2026. An ERP implementation pattern from five years ago is still defensible today. AI doesn't work like that. AI advice decays in months, and the decay is accelerating.

Here are four patterns I've watched die.

"Use LangChain to build LLM applications"

From 2023 through early 2025, this was the default recommendation in virtually every tutorial, bootcamp, and enterprise guide. LangChain defined the vocabulary the industry used to talk about building with LLMs.

Then the models caught up. OpenAI shipped native function calling, structured outputs, and the Agents SDK. Anthropic shipped tool use and prompt caching in the SDK itself. The abstractions LangChain was wrapping got absorbed by the providers. One team that documented their migration went from 1,200 lines of code and 14 framework dependencies down to 630 lines and 2 dependencies, with an 8-22% latency improvement, and their monthly maintenance burden dropped from a day and a half to essentially zero.
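
For a sense of what the native path looks like, here's a minimal sketch using the OpenAI SDK's built-in tool calling. The `get_account_balance` tool and the model name are mine, for illustration; they aren't from that team's migration.

```python
# Native tool calling with the OpenAI Python SDK, no framework layer.
import json

from openai import OpenAI

client = OpenAI()

# A hypothetical accounting tool, defined as a plain JSON schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_account_balance",
        "description": "Look up the current balance for a general ledger account.",
        "parameters": {
            "type": "object",
            "properties": {
                "account_id": {"type": "string", "description": "GL account number"},
            },
            "required": ["account_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; any tool-capable model works
    messages=[{"role": "user", "content": "What's the balance on account 1010?"}],
    tools=tools,
)

# The SDK returns a structured tool call; no extra parsing layer required.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```

The provider returns a structured tool call directly. The parsing and routing layers the framework used to supply simply aren't needed anymore.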

The framework that saved you time when you adopted it was costing you time by the time you left. Half-life: about twelve months.

"You need RAG for anything beyond the context window"

Retrieval-Augmented Generation was the reflexive default from 2023 through 2025. If you were building an LLM application, step one was to chunk your documents, embed them in a vector database, and retrieve at query time. Vector database companies raised hundreds of millions on this thesis.

Then context windows exploded. They went from 128K to 1 million tokens in a year, and Llama 4 Scout shipped a 10-million-token window. Meanwhile, the most successful AI coding agents — Cursor, Claude Code, Devin — aren't using vector databases at all. They use grep. Lexical search, file trees, and large context windows. A 200K-token query on Claude Sonnet costs about sixty cents with caching.
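
The lexical approach is simple enough to sketch. This is my hedged approximation of the grep-style retrieval those agents use, with an illustrative path and query; the real tools layer file trees and ranking on top.

```python
# Grep-style lexical retrieval, the way coding agents do it.
import subprocess

def lexical_search(pattern: str, root: str = "docs") -> str:
    """Recursive, case-insensitive grep: filenames, line numbers, matches."""
    result = subprocess.run(
        ["grep", "-rni", pattern, root],
        capture_output=True,
        text=True,
    )
    return result.stdout

# The raw matches go straight into the model's context as plain text.
print(lexical_search("revenue recognition"))
```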

RAG isn't dead everywhere — enterprise deployments grew 280% in 2025. But "always use RAG" turned out to be wildly over-prescriptive. For a lot of use cases, you can just load the documents and go.
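
And "load the documents and go" is about as literal as it sounds. A minimal sketch, assuming the corpus fits comfortably in the window; the directory and model name are illustrative.

```python
# No chunking, no embeddings, no vector database: concatenate and send.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

# Label each file so the model can cite where an answer came from.
corpus = "\n\n".join(
    f"=== {p.name} ===\n{p.read_text()}" for p in Path("docs").glob("*.md")
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; use any long-context model
    messages=[
        {"role": "system", "content": f"Answer from these documents:\n\n{corpus}"},
        {"role": "user", "content": "Summarize our revenue recognition policy."},
    ],
)
print(response.choices[0].message.content)
```

Half-life: about twelve months.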

"Build multi-agent systems"

This one peaked fast. By mid-2024 every conference talk and framework was pushing multi-agent architectures. AutoGen, CrewAI, LangGraph — the pitch was that the future of AI was multiple specialized agents collaborating. A "researcher" agent, a "writer" agent, a "reviewer" agent.

Then Anthropic published "Building Effective Agents" in December 2024 and quietly delivered the industry's reality check. Their core message: "Find the simplest solution possible, and only increase complexity when needed." The data backed them up. Single-agent systems covered roughly 80% of standard use cases. Multi-agent architectures degraded performance by 39-70% on sequential reasoning tasks and consumed up to 15x the tokens. Microsoft moved AutoGen into maintenance mode.
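
Here's what "the simplest solution possible" tends to look like in practice: one model, one message history, one loop over tool calls. This is my sketch, not code from Anthropic's post; the tool and model name are illustrative, and error handling is omitted.

```python
# A single agent: one model, one context, one loop. No crew, no hand-offs.
import json

from openai import OpenAI

client = OpenAI()

def run_tool(name: str, args: dict) -> str:
    """Dispatch to real implementations; stubbed here for illustration."""
    if name == "get_account_balance":
        return json.dumps({"account_id": args["account_id"], "balance": 4200.00})
    return "unknown tool"

tools = [{
    "type": "function",
    "function": {
        "name": "get_account_balance",
        "description": "Look up the current balance for a general ledger account.",
        "parameters": {
            "type": "object",
            "properties": {"account_id": {"type": "string"}},
            "required": ["account_id"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the balance on account 1010?"}]
while True:
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    msg = response.choices[0].message
    if not msg.tool_calls:
        print(msg.content)  # no more tool calls, so the task is done
        break
    messages.append(msg)  # keep the assistant's tool-call turn in history
    for call in msg.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_tool(call.function.name, json.loads(call.function.arguments)),
        })
```

One context, one loop, no hand-offs between a researcher and a reviewer.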

It went from "the future of AI" to "probably not what you need" in about six months.

"Always fine-tune for your domain"

Fine-tuning was the default recommendation for anything domain-specific from 2023 through early 2025. Build a training dataset, fine-tune a base model, deploy your custom version. Companies invested heavily in labeling pipelines and model training infrastructure.

Context windows killed this too. With a million tokens of context, you can load extensive domain-specific examples, documentation, and reference data directly into the prompt. No training pipeline, no custom model to maintain. And here's the part that really hurts: fine-tuned models get stranded every time a new foundation model drops. Which is now every few months. What was once a cutting-edge customized model becomes obsolete overnight when the next base model ships with better performance out of the box.
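
The replacement workflow is almost embarrassingly simple: put the examples you would have trained on directly into the prompt. A minimal sketch; the examples, reference file, and model name are all illustrative.

```python
# In-context domain adaptation: the prompt carries what fine-tuning used to.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

# Labeled examples that would once have gone into a training dataset.
examples = """\
Invoice: "AWS monthly charges" -> Account: 6420 Cloud Hosting
Invoice: "Delta flight for client visit" -> Account: 6310 Travel
Invoice: "Annual CPA license renewal" -> Account: 6150 Dues & Licenses
"""

reference = Path("chart_of_accounts.md").read_text()  # illustrative path

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; swap in the newest base model freely
    messages=[
        {
            "role": "system",
            "content": (
                "Classify invoices to general ledger accounts.\n\n"
                f"{reference}\n\nExamples:\n{examples}"
            ),
        },
        {"role": "user", "content": 'Invoice: "Zoom subscription renewal"'},
    ],
)
print(response.choices[0].message.content)
```

When a better base model ships, you change one string. There's no training run to redo.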

Half-life: about eight months.

The pattern

Read those four examples again, but pay attention to the timeline:

LangChain: about twelve months from best practice to liability.
RAG-everything: about twelve months.
Fine-tuning: about eight months.
Multi-agent systems: about six months.

The half-life itself is compressing. Each generation of advice decays faster than the one before it. The playbook you adopted last quarter expires sooner than the one it replaced.

What I'm doing about it

I don't have a framework for this. I don't think one exists. The only thing I've found that works is staying close enough to the actual work that you feel when something stops working — before the blog posts catch up to tell you.

Here's one I'm acting on right now. Chat-based agent interfaces exploded over the past year; everyone shipped one. I used them extensively. I've since moved to purpose-built interfaces because the work product got better. The generic chat window turns out to be a lowest-common-denominator experience that trades depth for flexibility. You can do more in a tool designed for the specific workflow than you can in a chat box that handles everything.

I could be completely wrong about this. People said the same thing at every transition I just described. But I'd rather move early and adjust than stay comfortable and fall behind.

The skill isn't knowing best practices. It's noticing when the one you're running has expired — and being willing to drop it before the consensus catches up.