Handling Repeated Mistakes with AI
How I use a mistakes.md file to help AI agents learn from their errors and avoid repeating them.
When an agent keeps picking the wrong macOS speech API or the same hallucinated method, I drop a mistakes.md next to that code and tell Claude or Cursor to read it before planning and append to it after every repeat failure. The same wrong turn twice becomes a note that follows me repo to repo.
Agents fix plenty on their own. Build fails, tests go red, they adjust. The annoying case is the loop. They're confident, you're not, and you're pasting the same Apple doc link for the third time because they slipped back into last year's API surface.
I used to stuff those corrections into claude.md. That works until the fix only matters in one folder, or until your global rules file reads like a novel and the agent ignores half of it anyway.
When a mistake is local (this package, this integration), I add mistakes.md in that directory. Not a project-wide manifesto. Just what we tried, what was wrong, what actually worked.
Example lines:
event.id, not payment_intent.id."
Short. Blunt. Written for the agent, not for a README audience.
In claude.md (or whatever rules file the tool respects), I add something like: before you write code in a directory, read mistakes.md if it exists.
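If you'd rather gather those notes yourself (to paste into a prompt, say), a minimal sketch could look like the one below. The function name `find_mistake_notes` is hypothetical, and checking parent directories up to the repo root is an assumption on my part; the rule above only asks the agent to read the file in the directory it's working in.

```python
from pathlib import Path


def find_mistake_notes(start: Path, repo_root: Path, name: str = "mistakes.md") -> list[Path]:
    """Collect notes files from `start` up to `repo_root`, outermost first.

    Hypothetical helper: the walk-up-to-root behaviour is an assumption,
    not something any agent tool is known to do on its own.
    """
    current = Path(start).resolve()
    repo_root = Path(repo_root).resolve()
    found = []
    while True:
        candidate = current / name
        if candidate.is_file():
            found.append(candidate)
        # Stop at the repo root, or at the filesystem root if `start`
        # turned out not to live under `repo_root` at all.
        if current == repo_root or current == current.parent:
            break
        current = current.parent
    # The walk went innermost-first; flip it so general notes come before local ones.
    return list(reversed(found))
```

Outermost-first ordering means the most local note is the last thing read, which is usually the one you want freshest.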
That one line changed more sessions than yelling "READ THE DOC I JUST SENT." Planning phase is when the model still has attention budget. By the time it's three files deep into a refactor, my correction is ancient history in the thread.
If it makes the same class of error after I've corrected it, I don't just complain in chat. I add a line to mistakes.md. Chat evaporates. The file stays.
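Appending by hand is fine, but the habit is easier to keep when it's one call. Here is a rough sketch, assuming a one-line-per-mistake layout with "tried / wrong / worked" fields. That layout and the name `append_mistake` are illustrative, not a format any tool requires.

```python
from datetime import date
from pathlib import Path


def append_mistake(directory: Path, tried: str, wrong: str, worked: str,
                   name: str = "mistakes.md") -> Path:
    """Append one dated line to the notes file in `directory`, creating it if needed.

    The "tried | wrong | worked" layout is an assumed convention.
    """
    notes = Path(directory) / name
    entry = f"- {date.today().isoformat()} | tried: {tried} | wrong: {wrong} | worked: {worked}\n"
    # If a hand-edited file lacks a trailing newline, add one so entries don't run together.
    needs_newline = (
        notes.is_file()
        and notes.stat().st_size > 0
        and not notes.read_text(encoding="utf-8").endswith("\n")
    )
    with notes.open("a", encoding="utf-8") as fh:
        if needs_newline:
            fh.write("\n")
        fh.write(entry)
    return notes
```

A call might look like `append_mistake(Path("app/speech"), "the older speech recognition API", "macOS 26 ships a better transcription API", "the macOS 26 transcription API")`, where the path is made up for the example.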
Over a few weeks the repo collects scar tissue. New mistakes still happen. They're usually new mistakes, which is the whole point.
Starting a greenfield app, I'll sometimes point an agent at mistakes.md files from an older project: "read these before you touch speech code."
That's how I got through another round of speech-to-text work. macOS has had a "speech recognition" API for years. It's fine. macOS 26 ships a much better transcription API, but it shows up second in every search the model runs, so the model picks the old one unless you've already told it not to.
I'd documented that trap in a previous repo's mistakes.md. On the new app, Cursor Auto one-shot the right implementation because the note was in context before the first line of code.
Without that file I was looking at another hour of "no, not that framework."
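Carrying notes over from an older project can be as simple as pointing the agent at the files, but if the old repo has a lot of them, filtering by topic keeps the context small. Here is a sketch under assumptions: `collect_notes` is a hypothetical helper, and plain keyword matching on each line is a crude stand-in for deciding what's relevant.

```python
from pathlib import Path


def collect_notes(old_repo: Path, keyword: str, name: str = "mistakes.md") -> str:
    """Gather lines mentioning `keyword` from every notes file under `old_repo`.

    Returns one text blob, with each file's matches under a short header,
    suitable for pasting into a new session. Matching is case-insensitive.
    """
    old_repo = Path(old_repo)
    needle = keyword.lower()
    sections = []
    # Sort so the output is stable from run to run.
    for notes in sorted(old_repo.rglob(name)):
        matches = [
            line
            for line in notes.read_text(encoding="utf-8").splitlines()
            if needle in line.lower()
        ]
        if matches:
            rel = notes.relative_to(old_repo).as_posix()
            sections.append(f"# from {rel}\n" + "\n".join(matches))
    return "\n\n".join(sections)
```

Something like `collect_notes(Path("../old-app"), "speech")` (path made up) would give you the block to hand over before the agent touches speech code.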
Still not perfect. Agents drift. I'd rather maintain a pile of small, honest mistakes.md files than explain the same hallucination four times in one afternoon.