Justy: Okay, so I read this thing where someone spent May running ninety-three hideous documents through fourteen OCR engines just to answer one question: do you actually need to pay sixty-five bucks per thousand pages for structured output?
Cody: Right.
Justy: And the answer—surprise—is a hard maybe. Unless your docs are clean PDFs, in which case this isn’t even a conversation.
Cody: Or unless you like throwing money at AWS. I mean, Textract’s fine, but sixty-five per thousand?
Justy: Exactly. And she tested everything—old invoices, handwritten notes, tax forms, even scanned newspapers.
Cody: Mm-hm.
Justy: Anyway—my week was spent wrestling with expense reports, which, turns out, are the perfect real-world OCR stress test. Cody, how was the red-eye?
Cody: Ugh. Landed at four, crashed at the apartment, woke up to my neighbor’s dog howling at a garbage truck. So, living the dream.
Justy: So you’re saying your brain is also a garbage truck right now.
Cody: Fair. But fine, back to your expense reports. What’d this experiment actually prove?
Justy: That OCR’s not a winner-take-all game anymore. The specialist models crushed it on their specific thing—tables, handwriting—but fell apart the second the doc strayed. So her big takeaway? OCR’s a routing problem. Classify first, then pick the engine.
Cody: Okay, but that’s a lot of infrastructure for most teams. You’re telling me I need a classifier, a router, and then a fleet of models just to parse a PDF?
Justy: No, I’m telling you that if you’re paying for structured output on every single thing, you’re probably overspending. Route the messy stuff to the heavy hitters, send the simple scans to the cheap open-source models, and save a ton.
Cody: Right, right. And if your ‘messy stuff’ is fifty percent of your volume, suddenly you’re maintaining a whole pipeline just to shave a few cents off per page.
Justy: Cody, you’re doing the thing where you assume the worst-case scenario is the only scenario.
Cody: No, I’m doing the thing where I’ve seen ‘simple routing’ turn into a six-month project because someone didn’t account for the edge cases.
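A rough Python sketch of the classify-then-route idea and of Cody's fifty-percent objection. Only the sixty-five dollars per thousand pages comes from the conversation. The cheap-engine price, the document classes, and every name in the code are hypothetical placeholders, not details from the experiment.

```python
from dataclasses import dataclass

# Only the $65 per 1,000 pages figure comes from the conversation.
# The cheap-engine price is an assumed placeholder for illustration.
PREMIUM_PER_1K = 65.00
CHEAP_PER_1K = 1.00  # hypothetical


@dataclass
class Doc:
    name: str
    kind: str   # label from a hypothetical upstream classifier
    pages: int


# Hypothetical classes that get sent to the expensive structured engine.
MESSY_KINDS = {"handwriting", "table", "rotated_receipt"}


def pick_engine(doc: Doc) -> str:
    """Route messy documents to the premium engine, everything else to the cheap one."""
    return "premium" if doc.kind in MESSY_KINDS else "cheap"


def batch_cost(docs: list[Doc], route: bool = True) -> float:
    """Total cost in dollars; with route=False every page goes to the premium engine."""
    total = 0.0
    for doc in docs:
        engine = pick_engine(doc) if route else "premium"
        rate = PREMIUM_PER_1K if engine == "premium" else CHEAP_PER_1K
        total += doc.pages * rate / 1000
    return total


def blended_rate(messy_fraction: float) -> float:
    """Effective price per 1,000 pages when a given fraction of pages is messy."""
    return messy_fraction * PREMIUM_PER_1K + (1 - messy_fraction) * CHEAP_PER_1K


if __name__ == "__main__":
    docs = [
        Doc("invoice.pdf", "clean_text", 500),
        Doc("notes.png", "handwriting", 500),
    ]
    print("premium for everything:", batch_cost(docs, route=False))
    print("with routing:", batch_cost(docs, route=True))
    print("blended rate at 50% messy:", blended_rate(0.5))
```

Under these assumed prices, a half-messy workload still costs roughly half of the all-premium rate. Whether that saving pays for building and maintaining the classifier is the judgment call the two are arguing about.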
Justy: Okay, okay. But her actual advice is just to test on your own data. Stop trusting public benchmarks. Run your docs through a few engines, see what fails, and then decide.
Cody: That part I’ll buy. I’ve seen Tesseract do great on clean text and then completely lose its mind on a rotated receipt.
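One way to "test on your own data" is to hand-transcribe a few of your own pages and score each engine's output by character error rate. The Python sketch below assumes you already have each engine's text as a string. It does not call any real OCR library, and the engine names and sample strings are made up.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


def rank_engines(reference: str, outputs: dict[str, str]) -> list[tuple[str, float]]:
    """Sort engines from lowest to highest error on one document."""
    scores = [(name, cer(reference, text)) for name, text in outputs.items()]
    return sorted(scores, key=lambda pair: pair[1])


if __name__ == "__main__":
    truth = "TOTAL 42.00"
    # Hypothetical outputs from two unnamed engines on the same receipt.
    outputs = {"engine_a": "TOTAL 42.00", "engine_b": "T0TAL 4Z.00"}
    for name, score in rank_engines(truth, outputs):
        print(f"{name}: CER {score:.2f}")
```

Running this per document class (receipts, handwriting, tables) rather than over one pooled score is what would surface the "great on clean text, lost on a rotated receipt" pattern Cody mentions.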
Justy: And that’s the point. The market’s flooded with new options (small vision models, VLMs, LlamaParse), but none of them are universal. So the only real answer is ‘it depends.’
Cody: Which, for a field that’s been around since the eighties, feels… underwhelming.
Justy: That is SUCH an Exploring Next take. We spend an hour nerding out on an OCR bake-off and your summary is ‘this is fine but depressing.’
Cody: I mean, it’s accurate.
Justy: Anyway, practical upshot: if you’re not testing on your own data, you’re flying blind. And if you’re paying for structured OCR on docs that don’t need it, you’re just throwing money at AWS.
Cody: Or at least at the problem.
Justy: There it is. Alright, I’m gonna go pretend my expense reports are someone else’s problem.
Cody: Good luck with that.