The Edit Trick: Efficient LLM Annotation of Documents
TL;DR: The “edit trick”, like many good ideas, is simultaneously obvious and not. When annotating documents using LLMs, don’t send the whole document in and generate a modified version. Rather, generate a list of edits that are applied to the original document. It’s much faster and cheaper.
“College of Sir Isaac Newton, renowned inventor of the milled-edge coin and the catflap!”
“The what?” said Richard.
“The catflap! A device of the utmost cunning, perspicuity and invention.”
“Yes,” said Richard, “there was also the small matter of gravity.”
“Gravity was merely a discovery. It was there to be discovered… But the catflap… ah, that is invention, pure creative invention.”
“I would have thought it was quite obvious. Anyone could have thought of it.”
“Ah, it is a rare mind indeed that can render the hitherto nonexistent blindingly obvious.”
— From Dirk Gently’s Holistic Detective Agency
Like Newton’s catflap, the edit trick is one of those inventions that seems blindingly obvious once someone points it out. Yet it represents a fundamental shift in how we think about document processing with LLMs.
The Problem: Paying for Expensive Copy-Paste
I’ve been deep in the world of AI-assisted development lately, and one pattern has become strikingly clear: when you’re processing documents with an LLM, the obvious approach is painfully inefficient.
Ask yourself: what happens when you want Claude to add section headings to a 5,000-word article? The obvious approach involves sending the entire document and asking for the whole thing back with headings added. But when you think about it, 95% of that work is paying for the most expensive copy-paste you can imagine.
The issue boils down to three critical bottlenecks:
- Token usage (and therefore cost) skyrockets
- Processing time drags on unnecessarily
- Context window limitations become a real headache, especially because many LLMs with large input context windows still impose much lower, harder limits on output length
The Edit Trick: A Better Way
I decided to build a small project to demonstrate a much better approach I’ve used a few times now. I didn’t invent this; I reverse-engineered it from watching Claude Code and realized it had much broader applicability. The concept is simple yet powerful: why have the LLM regurgitate the entire document when all you need are the edits?
Here’s how the “edit trick” works:
- Send your document to the LLM
- Instead of asking for the modified document back, ask for a list of specific edits in a sed-like format (or JSON, if you prefer)
- Apply those edits locally without requiring the LLM to generate duplicate content
The syntax is beautifully straightforward: s/unique text marker/## Heading Text\n\n$0/ where the unique text marker identifies where to add the heading, and $0 preserves the original text. Of course, you have to make sure that your marker really is unique within the document, but you can prompt the LLM to verify that it is.
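Applying these edits locally takes only a few lines of code. Here is a minimal Python sketch under my own assumptions: the apply_edit helper, its command parsing, and its error handling are illustrative, not a specific library's API.

```python
import re


def apply_edit(document: str, command: str) -> str:
    """Apply one sed-like edit of the form s/marker/replacement/.

    In the replacement, a literal \\n becomes a newline and $0 stands
    for the matched marker text, so the original content is preserved.
    """
    match = re.fullmatch(r"s/(.*?)/(.*)/", command, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"Malformed edit command: {command!r}")
    marker, replacement = match.group(1), match.group(2)
    # The marker must be unique, or the edit would be ambiguous.
    if document.count(marker) != 1:
        raise ValueError(f"Marker is not unique in document: {marker!r}")
    replacement = replacement.replace("\\n", "\n").replace("$0", marker)
    return document.replace(marker, replacement)


doc = "Intro paragraph.\n\nThe catflap was a cunning device."
edited = apply_edit(doc, r"s/The catflap was/## On Invention\n\n$0/")
```

Rejecting non-unique markers (rather than silently editing the first occurrence) is the safety valve here: it turns an LLM mistake into a visible error you can retry instead of a corrupted document.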
But Does It Actually Work?
Yes, it does. To prove it, I created a small benchmark to measure the difference between the traditional approach and the edit trick. You can check out the full benchmarking code here.
I tested both approaches on the same task: adding section headings to a 7,000-character article. The results were striking:
Benchmark Results (Average)
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Metric ┃ Full Approach ┃ Edit Trick ┃ Difference ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ Estimated Cost │ $0.031 │ $0.0096 │ $0.021284 (69%) │
│ Output Tokens │ 1702 │ 246 │ 1455 (86%) │
│ Input Tokens │ 1786 │ 1968 │ -182 (-10%) │
│ Processing Time │ 29.92s │ 6.17s │ 23.75s (79%) │
│ Headings Added │ 10 │ 9 │ 1 │
└─────────────────┴───────────────┴────────────┴─────────────────┘
- Costs shown are based on Claude 3.7 Sonnet pricing
- Input tokens increased for the edit trick because the prompt needs to include instructions for the edit format, which is slightly more complex than asking for a full rewrite.
That’s an 86% reduction in output tokens! Since output tokens are typically more expensive than input tokens, this translates to dramatic cost savings — almost 70% cheaper.
Processing Time: The edit trick completed in just 6 seconds compared to 30 seconds for the traditional approach. A 79% speed improvement means faster iterations and more responsive applications.
Quality Comparison: Both approaches successfully added section headings, with the traditional approach adding 10 headings and the edit trick adding 9. This difference occurred because the edit trick approach was more selective in identifying natural section breaks, while the full-document approach sometimes added headings even for smaller subsections. In other runs, the edit trick identified 7 or 11 headings depending on the granularity level, while the full approach consistently added 10. The quality of the headings themselves was comparable, with both approaches choosing clear and descriptive titles that accurately reflected the content.
When to Use This Technique
The edit trick shines in several scenarios:
- Document formatting tasks (adding headings, standardizing formatting)
- Text enhancement (expanding sections, adding citations)
- Content organization (restructuring paragraphs, adding section breaks)
- Any task where the original content mostly stays intact
It’s particularly valuable when working with longer documents that would otherwise bump against context window limitations.
Implementing It Yourself
The implementation is surprisingly simple. A sed-like edit pattern gives you tremendous flexibility while keeping the LLM’s task focused. The key is crafting a clear prompt that instructs the LLM to:
- Identify positions in the document that need modification
- Generate specific edit commands rather than the full document
- Use a consistent format that’s easy to parse programmatically
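The steps above can be sketched in Python. The prompt wording below is illustrative and should be tuned for your model; parse_edits simply filters the model's output down to well-formed edit lines.

```python
# Illustrative prompt: the exact wording is an assumption, not a
# known-good recipe. Fill in {document} with str.format.
EDIT_PROMPT_TEMPLATE = """Add section headings to the document below.
Respond ONLY with edit commands, one per line, in this exact form:
s/unique text marker/## Heading Text\\n\\n$0/

Rules:
- The marker must be copied verbatim from the document and must
  appear exactly once in it.
- $0 stands for the original marker text, so it is preserved.

Document:
{document}
"""


def parse_edits(llm_output: str) -> list[str]:
    """Keep only well-formed edit lines, dropping stray commentary."""
    edits = []
    for line in llm_output.splitlines():
        line = line.strip()
        if line.startswith("s/") and line.endswith("/"):
            edits.append(line)
    return edits
```

Filtering defensively matters because models sometimes wrap their answer in chatty preamble; parsing only lines that match the expected shape keeps the pipeline robust without any extra round trips.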
I found this works remarkably well across different LLMs, though Claude’s precision makes it particularly suited for generating these kinds of targeted edits.
The Bigger Picture: Efficient AI Engineering
The edit trick isn’t just about saving tokens — it’s about developing a more thoughtful, precise approach to working with LLMs. Just as a skilled carpenter knows when to use a chainsaw versus a handsaw, knowing when to apply the edit trick versus the brute-force approach makes you a more effective AI engineer.
As LLMs become increasingly integrated into our workflows, techniques like this will separate those who use AI efficiently from those who merely use it. The next time you’re about to send a large document to an LLM for modification, ask yourself: “Do I need the entire document back, or just the changes?” Your users — and your budget — will thank you.
Try implementing the edit trick in your next project and share your results. The code is open source, and I’m interested to see how others adapt and improve this technique for their own document processing workflows.