Epstein Suite

System Status: Active Ingestion

Live processing: OCR, AI summaries, and data indexing in progress across ~3.5 million newly released pages.


Product Roadmap

Where Epstein Suite is headed

We’re building a transparent, searchable archive of the DOJ releases with modern tooling. Here’s a running log of what shipped and what’s next.

Last updated: December 21, 2025

  • Drive: 10,157 files
  • Mail: 1,247 emails
  • Contacts: 2,345 people
  • Flights: 20 logs
  • Photos: Images + scans
  • Analytics: Live stats

Highlights

What we’ve already accomplished

Ingestion & search

  • Automated download + OCR pipeline with retry + error dashboards
  • AI summaries aligned with OCR text and entity linking
  • Full-text Drive search with source linking + manual file pulls

Apps people can use now

  • Epstein Drive, Mail, Contacts, Flights, Photos, Analytics
  • Live entity graphing + related document discovery
  • Source verification + broken-file reporting with admin alerts

Shipped · Dec 20, 2025

Core ingestion & visibility

  • Built detailed ingestion health views (Sources dashboard + Ingestion Progress live view) plus dataset-specific analytics.
  • Added retry APIs & UI (single + bulk) with ingestion error logging per step.
  • Upgraded Drive browsing with OCR status filters, folder fallbacks, and Feather icons.
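
To make the retry and per-step error logging idea concrete, here is a minimal Python sketch. It is not the Suite's actual code: the step names, the IngestionResult type, and the run_pipeline and retry_bulk helpers are all hypothetical, and the real retry APIs may work quite differently. It only illustrates retrying each step a bounded number of times while recording which step failed.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

log = logging.getLogger("ingestion")

# Hypothetical step names; the roadmap only mentions download, OCR, and AI summaries.
STEPS = ("download", "ocr", "summary")


@dataclass
class IngestionResult:
    file_id: str
    completed: List[str] = field(default_factory=list)
    errors: List[Tuple[str, str]] = field(default_factory=list)  # (step, message)

    @property
    def ok(self) -> bool:
        # A file counts as ingested only when every step finished.
        return self.completed == list(STEPS)


def run_pipeline(
    file_id: str,
    handlers: Dict[str, Callable[[str], None]],
    max_attempts: int = 3,
) -> IngestionResult:
    """Run each step in order, retrying a failing step up to max_attempts times."""
    result = IngestionResult(file_id)
    for step in STEPS:
        for attempt in range(1, max_attempts + 1):
            try:
                handlers[step](file_id)
                result.completed.append(step)
                break
            except Exception as exc:
                # Record the failure against the step that raised it.
                result.errors.append((step, f"attempt {attempt}: {exc}"))
                log.warning("file=%s step=%s attempt=%d failed: %s",
                            file_id, step, attempt, exc)
        else:
            # Retries exhausted: later steps depend on this one, so stop here.
            return result
    return result


def retry_bulk(
    file_ids: Iterable[str],
    handlers: Dict[str, Callable[[str], None]],
    max_attempts: int = 3,
) -> Dict[str, IngestionResult]:
    """Bulk variant: run the same pipeline for many files and collect the results."""
    return {fid: run_pipeline(fid, handlers, max_attempts) for fid in file_ids}
```

Keeping the error list keyed by step is what would let a dashboard show where a file got stuck, rather than only that it failed.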

Shipped · Dec 21, 2025

Reliability, Ask beta & user tools

  • Manual “Create local copy” button + API for missing files, plus MIME sniffing to render PDFs vs images correctly.
  • Logging hardened by relocating download logs under /storage/logs.
  • Broken-file reporting with admin notifications, plus Drive/source visibility polish across the suite.
  • AI summary cards + entity chips revamped to match the Suite aesthetic you see in document view.
  • Launched Ask EpsteinSuite (beta) at /ask.php: a fully logged chat with document citations and ingestion awareness.
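
As an illustration of the MIME sniffing mentioned above, the sketch below inspects a file's leading bytes instead of trusting its extension. The signatures are the standard magic numbers for PDF, PNG, JPEG, GIF, and TIFF; the function names sniff_mime and viewer_for, and the "pdf" / "image" / "download" viewer labels, are assumptions made for this example and not the Suite's own code.

```python
# Well-known file signatures ("magic numbers") paired with their MIME types.
SIGNATURES = (
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
)


def sniff_mime(head: bytes) -> str:
    """Guess a MIME type from the first bytes of a file."""
    for magic, mime in SIGNATURES:
        if head.startswith(magic):
            return mime
    return "application/octet-stream"


def viewer_for(head: bytes) -> str:
    """Pick a (hypothetical) viewer label based on the sniffed type."""
    mime = sniff_mime(head)
    if mime == "application/pdf":
        return "pdf"
    if mime.startswith("image/"):
        return "image"
    return "download"
```

Reading only the first few bytes is enough for these formats, so a check like this could run before a file is fully downloaded or served.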

In Progress

Complete the ingestion backlog

Finish downloading every file locally so OCR + AI can work without falling back to remote sources. Between the manual “Create local copy” button and the nightly download scripts, the number of “remote-only” files is converging on zero.
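
One way to track that backlog is to list every record that still lacks a local copy. The Python sketch below assumes a hypothetical FileRecord shape with a remote URL and an optional local path; the Suite's real schema is not described here, so treat the field names as placeholders.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, List, Optional


@dataclass
class FileRecord:
    # Hypothetical fields; the actual schema is not documented in this roadmap.
    file_id: str
    remote_url: str
    local_path: Optional[Path] = None


def remote_only(records: Iterable[FileRecord]) -> List[FileRecord]:
    """Return records with no local copy, or whose recorded copy is missing on disk."""
    return [
        r for r in records
        if r.local_path is None or not r.local_path.is_file()
    ]
```

A nightly script could feed the result of remote_only into its download queue and report the list length as the remaining gap.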

Next Up · Targeting Q1 2026

Researcher APIs & programmatic access

  • Ship authenticated REST APIs that return documents, entities, flights, and emails with pagination + webhooks.
  • Provide streaming Ask responses via API so researchers can integrate the RAG pipeline into their own apps.
  • Harden rate limits + audit logs so external clients stay compliant with privacy and safety requirements.
  • Expand ingestion monitors so API consumers can subscribe to dataset-specific change events.
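
Since these APIs have not shipped, the following is only a sketch of how a researcher's client might walk a paginated endpoint. It assumes cursor-based pagination and a hypothetical response shape of {"items": [...], "next_cursor": ...}; the real API may use page numbers, different field names, or something else entirely. The fetch function is injected so the loop can be exercised without any network access.

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical response shape: {"items": [...], "next_cursor": "<token>" or None}
Page = Dict[str, object]
FetchPage = Callable[[Optional[str]], Page]


def iter_items(fetch_page: FetchPage, max_pages: int = 1000) -> Iterator[dict]:
    """Yield every item across pages, following next_cursor until it is None."""
    cursor: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            return
    # Guard against a server bug that keeps returning a cursor forever.
    raise RuntimeError("pagination did not terminate within max_pages")
```

A usage example with canned pages standing in for HTTP responses:

```python
pages = {
    None: {"items": [{"id": 1}, {"id": 2}], "next_cursor": "a"},
    "a": {"items": [{"id": 3}], "next_cursor": None},
}
ids = [doc["id"] for doc in iter_items(lambda cursor: pages[cursor])]
```

The max_pages cap also gives a client a natural place to respect whatever rate limits the API ends up enforcing.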

Have input?

Email admin@kevinchamplin.com with feature requests or datasets we should ingest next.