The Epstein Files Were "Redacted" — People Unredacted Them in Seconds
The DOJ's Epstein files could be unredacted with copy and paste. The same flaw exists in most PDF redaction workflows. Here's how to check if your documents have the same vulnerability.
RedactBox Team
5 February 2026
A 10-Second Test That Should Worry You
Open one of your recently redacted PDFs. Click on a black redaction box. Try to select the text underneath. Now paste it into Notepad or any text editor.
If readable text appears, your redaction is not a redaction. It is a black rectangle sitting on top of fully intact, fully extractable content. Every document you have ever "redacted" this way could be reversed by anyone with a mouse.
This is not a theoretical risk. In January 2026, the U.S. Department of Justice released 3.5 million pages of Epstein case documents with redactions applied. Within hours, people on social media were reversing the redactions using nothing more than copy and paste. The identities of at least 31 child victims — the very people the redactions were meant to protect — were exposed.
As one attorney representing survivors told NBC News, the DOJ had failed to redact the identities of people who were "victimised as children." A Wall Street Journal review found at least 43 victims' full names exposed — some appearing over 100 times across the database, with home addresses visible through simple keyword searches.
Why Black Boxes Fail
To understand the problem, you need to understand how PDFs actually work. A PDF is not a flat image. It is a structured file with multiple layers — text streams, annotation layers, image layers, and metadata. When most tools "redact" a PDF, they add a black rectangle as a new annotation layer on top of the existing text. The two layers render together on screen, creating the appearance that the text is gone.
But the text layer is completely untouched. It is still searchable, still selectable, and still embedded in the file. Any PDF reader or text extraction tool will expose it, as will a simple Ctrl+A followed by copy and paste.
The PDF Association's forensic analysis of the Epstein documents confirmed exactly this pattern — black rectangles placed over text without any modification to the underlying content stream.
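To make the mechanism concrete, here is a minimal sketch in Python. It is not taken from the Epstein files or from any particular tool: the content stream is hand-written for illustration, the name and address are invented, and `extract_shown_text` is a hypothetical helper. Real PDFs usually compress their streams and some tools add the box as an annotation rather than a drawing operation, but in both cases the text operator is left where it was.

```python
import re

# A hand-written, uncompressed page content stream (illustrative only).
# The text is drawn first, then a filled black rectangle is painted over it.
CONTENT_STREAM = (
    b"BT /F1 12 Tf 72 700 Td (Jane Doe, 14 Elm Street) Tj ET\n"  # the "hidden" text
    b"0 0 0 rg 70 696 160 16 re f\n"                             # the black box on top
)

def extract_shown_text(stream: bytes) -> list[str]:
    """Return every simple literal string passed to the Tj (show text) operator."""
    matches = re.findall(rb"\(([^()\\]*)\)\s*Tj", stream)
    return [m.decode("latin-1") for m in matches]

print(extract_shown_text(CONTENT_STREAM))  # ['Jane Doe, 14 Elm Street']
```

The rectangle changes what is painted on screen, not what is stored. Anything that reads the stream instead of rendering it sees the original string.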
This Has Happened Before (And It Will Happen Again)
The Epstein files were not the first high-profile redaction failure:
- In 2019, lawyers for Paul Manafort filed a court document with black-box redactions. Journalists copied and pasted the "redacted" sections to reveal details about his contacts with an associate alleged to have ties to Russian intelligence.
- In 2009, the TSA published a redacted security manual. The redactions were PDF annotations that could be removed entirely, exposing airport screening procedures.
- In 2005, the US military published a redacted report on the killing of an Italian intelligence agent in Iraq. The blacked-out passages in the PDF were recovered by copying the text and pasting it into a word processor.
The same class of error, repeated across two decades. The tools changed. The mistake did not.
What "Permanent Redaction" Actually Means
Genuine redaction is not about hiding content. It is about destroying it. The original text must be removed from the PDF's content stream entirely — not covered, not overlaid, not annotated. Deleted.
But even deleting the text is not always enough. The final PDF must also be flattened — all layers merged into a single output where no separate text stream, annotation, or metadata layer exists. A flattened PDF is effectively a series of images. There is nothing to select, nothing to copy, nothing to extract.
Here is the difference at a glance:
| | Black Box Overlay | Permanent + Flattened |
|---|---|---|
| Text underneath | Still exists, fully intact | Permanently removed from file |
| Copy-paste test | Reveals hidden text | Nothing to select or copy |
| PDF text extraction | Extracts all original content | No text layer exists |
| Annotation removal | Removes the black box entirely | No annotations — single flat layer |
| Reversibility | Trivially reversible by anyone | Irreversible — original data destroyed |
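The right-hand column of the table can be turned into a rough automated check. The sketch below is a heuristic under stated assumptions, not a PDF parser: it assumes streams are either plain or Flate-compressed, it looks only for the common `Tj` and `TJ` text operators, and binary image data could in principle trigger a false positive. The function names and the two sample byte strings are hypothetical, and the samples are fragments rather than complete PDF files.

```python
import re
import zlib

STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)

def decoded_streams(pdf: bytes):
    """Yield each stream body, inflated when it is Flate (zlib) data."""
    for match in STREAM_RE.finditer(pdf):
        body = match.group(1)
        try:
            yield zlib.decompressobj().decompress(body)
        except zlib.error:
            yield body  # not zlib data; inspect the bytes as they are

def flattening_problems(pdf: bytes) -> list[str]:
    """List signs that a PDF still carries annotations or a text layer."""
    streams = list(decoded_streams(pdf))
    problems = []
    if any(b"/Annots" in blob for blob in [pdf, *streams]):
        problems.append("annotations present")
    if any(re.search(rb"\bBT\b.*?\b(?:Tj|TJ)\b", s, re.DOTALL) for s in streams):
        problems.append("text-showing operators present")
    return problems

# Fragment with text drawn under a black box, and one that only paints an image.
overlay = (b"1 0 obj\n<< >>\nstream\nBT /F1 12 Tf 72 700 Td (Jane Doe) Tj ET\n"
           b"0 0 0 rg 70 696 80 16 re f\nendstream\nendobj\n")
flat = b"1 0 obj\n<< >>\nstream\nq 612 0 0 792 0 0 cm /Im0 Do Q\nendstream\nendobj\n"

print(flattening_problems(overlay))  # ['text-showing operators present']
print(flattening_problems(flat))     # []
```

An empty result is encouraging rather than conclusive. Treat it as one signal alongside the manual checks, not as proof that a file is safe to release.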
The Email Archive Problem
The Epstein files were not standalone PDFs. They originated from email archives — millions of messages exported from email systems and converted to PDF for disclosure. This is the same workflow faced by UK organisations processing Subject Access Requests and FOI responses.
Email archives introduce additional complications that standalone PDF redaction does not address:
- Volume: A single MBOX file might contain thousands of emails. Manually redacting each one as a PDF is impractical and error-prone at scale.
- Consistency: The same name, phone number, or email address may appear across hundreds of messages. Missing even one instance means the redaction is incomplete.
- Format conversion: Emails in MBOX or PST format must be converted to PDF before disclosure. If the conversion tool does not handle redaction natively, you end up with a fragile multi-step process where content can slip through.
- Embedded content: Emails contain headers, metadata, attachments, and forwarded threads. Each element is a potential source of unredacted personal data.
This is exactly why the DOJ's failure was so extensive. Processing millions of email-sourced documents without a proper redaction pipeline guaranteed that inconsistencies would appear across the release.
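The consistency problem in particular lends itself to a quick script. The following is a minimal sketch using Python's standard `mailbox` module to list every message in an MBOX file that mentions a given term, whether in a header or in a text body. The function names are hypothetical, attachments other than text parts are not inspected, and PST files are not covered because the standard library cannot read them.

```python
import mailbox
from email.message import Message

def message_text(msg: Message) -> str:
    """Join all header values and all decoded text parts of one email."""
    chunks = [str(value) for value in msg.values()]
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and binary attachments
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:  # charset name Python does not recognise
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)

def find_occurrences(mbox_path: str, term: str) -> list[tuple[int, str]]:
    """Return (position, subject) for every message that mentions the term."""
    needle = term.casefold()
    hits = []
    # create=False makes a mistyped path raise instead of creating an empty file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for position, msg in enumerate(box):
            if needle in message_text(msg).casefold():
                hits.append((position, str(msg.get("Subject", ""))))
    finally:
        box.close()
    return hits
```

Calling `find_occurrences("disclosure.mbox", "Jane Doe")`, with a file name and search term of your own, gives a list to reconcile against the redactions actually applied. Any message on the list without a matching redaction is a gap.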
A Practical Checklist for UK Organisations
If your organisation discloses redacted documents — for SARs, FOI requests, legal proceedings, or internal investigations — run through this checklist:
- Run the copy-paste test. Open a recently redacted PDF. Try to select and copy text from a redacted area. If anything pastes, your process is broken.
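- Check your export is flattened. Run the PDF through a text extraction tool, or open it in a text editor (not a PDF viewer), and search for any strings that should have been redacted. If you find them, the PDF is not properly flattened. A text editor shows only uncompressed content, and most PDFs compress their content streams, so a clean result in a text editor alone does not prove the text is gone.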
- Verify consistency across documents. If you redacted a name in one email, search for that name across all other documents in the disclosure. Inconsistent redaction is the most common failure in large archives.
- Check metadata. PDF properties, author fields, and embedded comments can all contain personal data that survives visual redaction. Ensure your export process strips metadata.
- Test with a fresh pair of eyes. Have someone who was not involved in the redaction attempt to find unredacted content. Fresh perspective catches what familiarity misses.
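Parts of this checklist can be scripted as a safety net behind the manual checks. The sketch below searches a PDF's raw bytes, and any Flate-compressed streams it can inflate, for terms that should no longer be present. It is an assumption-laden heuristic: the function names are hypothetical, matching is case-sensitive, and text stored as hex strings or under a custom font encoding will not be found. A hit is a definite leak, while a clean result is not proof of safety.

```python
import re
import zlib

STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)

def searchable_blobs(pdf: bytes) -> list[bytes]:
    """Return the raw file plus every stream body that inflates with zlib."""
    blobs = [pdf]
    for match in STREAM_RE.finditer(pdf):
        try:
            blobs.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate data; its raw bytes are already in blobs[0]
    return blobs

def encodings_of(term: str) -> list[bytes]:
    """Byte forms to look for: UTF-16BE (common in metadata) and Latin-1."""
    forms = [term.encode("utf-16-be")]
    try:
        forms.append(term.encode("latin-1"))
    except UnicodeEncodeError:
        pass  # term has characters outside Latin-1; search UTF-16BE only
    return forms

def leaked_terms(pdf: bytes, terms: list[str]) -> list[str]:
    """Return the terms that still appear somewhere in the file."""
    blobs = searchable_blobs(pdf)
    found = []
    for term in terms:
        if not term:
            continue  # an empty term would match everything
        if any(form in blob for blob in blobs for form in encodings_of(term)):
            found.append(term)
    return found
```

Running `leaked_terms` over every file in a disclosure with the same list of redacted names covers the flattening, consistency, and metadata items in one pass, since author fields and other properties are usually part of the bytes being searched.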
How RedactBox Approaches This
We built RedactBox around the email disclosure workflow specifically because it is where redaction failures are most likely to occur — and most damaging when they do.
- Flattened PDF exports as standard. Every PDF exported from RedactBox is flattened. There is no text layer to extract, no annotation to remove, no hidden content to recover. This is not an option you need to remember to tick — it is how the system works. See our full feature overview.
- Native email archive processing. Upload MBOX or PST files directly. No manual PDF conversion step where content can be missed.
- Search & Redact across entire archives. Search for a name, phone number, or phrase across every email in your project and redact all matches in one action. This eliminates the consistency problem — if it appears anywhere, it gets caught.
- Highlight to Redact with archive-wide modes. Select text in any email and choose to redact it in the current message only, or across every email containing matching text.
- Standalone PDF redaction too. Upload individual PDFs for redaction with the same permanent, flattened output.
The Bottom Line
The Epstein files failure was not a technical edge case. It was the predictable result of using visual overlays instead of permanent redaction, applied across millions of documents with no consistency checks. The same risk exists in any organisation that processes email disclosures using general-purpose PDF tools.
Run the copy-paste test on your last disclosure. If nothing pastes, you have cleared the first hurdle, and the rest of the checklist will tell you whether the file is genuinely clean. If text does appear, fix the process before your next SAR deadline.
Try RedactBox free — upload an MBOX, PST, or PDF file and see how permanent, flattened redaction works.