Justy: Cody, this one matters because a lot of teams are still basically asking "why is my app slow?" when the real answer is just that Postgres writes got expensive.
Cody: Yeah. My skeptical read is the headline is probably directionally real, but the number is doing a lot of work. They're fixing a known Postgres pain point, not inventing new physics.
Cody: The pain point is durability overhead. In normal Postgres, after a checkpoint, the first write to a page can force a full 8KB page image into WAL so crash recovery doesn't replay onto a half-written page. On write-heavy systems that balloons log traffic, and they say it can go up by like 15x in bad cases.
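To put rough numbers on what Cody describes, here is a toy model of full-page-write amplification. It is a sketch under simplifying assumptions, not how Postgres actually sizes WAL records: every change record has a fixed size, a full page image costs exactly 8192 bytes with no compression, and the names `wal_bytes`, `record_bytes`, and the `(checkpoint_epoch, page_id)` tuple format are made up for illustration.

```python
PAGE_SIZE = 8192  # default Postgres block size in bytes


def wal_bytes(updates, record_bytes, full_page_writes=True):
    """Estimate WAL volume for a sequence of (checkpoint_epoch, page_id) updates.

    Toy model: with full_page_writes on, the first update to a page after
    each checkpoint logs a whole page image on top of the small change record.
    """
    imaged = set()  # (epoch, page) pairs that already got an image this cycle
    total = 0
    for epoch, page in updates:
        total += record_bytes
        if full_page_writes and (epoch, page) not in imaged:
            imaged.add((epoch, page))
            total += PAGE_SIZE
    return total


# Hypothetical workload: 1,000 small updates spread over 200 pages,
# with a checkpoint every 250 updates.
workload = [(i // 250, i % 200) for i in range(1000)]
with_fpw = wal_bytes(workload, record_bytes=100, full_page_writes=True)
without_fpw = wal_bytes(workload, record_bytes=100, full_page_writes=False)
print(f"amplification in this toy model: {with_fpw / without_fpw:.1f}x")
```

The ratio depends entirely on how scattered the writes are and how often checkpoints happen, which is why a single headline multiplier deserves the skepticism it gets in this conversation.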
Justy: Right.
Cody: Lakebase gets to dodge that because compute is stateless and storage is separate. The compute streams WAL to a Paxos-backed safekeeper quorum, so there isn't a local data page sitting on disk that can get torn in the old-school way.
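For a feel of what "streams WAL to a quorum" means for durability, a minimal sketch of a generic majority rule follows. It assumes each safekeeper reports the highest WAL position it has durably stored, and treats a position as committed once a majority has it. This is a simplification for illustration only; the function name `quorum_commit_lsn` is hypothetical and the real protocol (elections, terms, recovery) is not shown.

```python
def quorum_commit_lsn(acked_lsns):
    """Return the highest WAL position acknowledged by a majority of nodes.

    acked_lsns: one integer per safekeeper, the highest position that node
    has durably stored. Generic majority-quorum rule, not a full Paxos.
    """
    if not acked_lsns:
        raise ValueError("need at least one safekeeper acknowledgement")
    majority = len(acked_lsns) // 2 + 1
    # After sorting descending, the value at index majority-1 is held by
    # at least `majority` nodes (that node and every node ahead of it).
    return sorted(acked_lsns, reverse=True)[majority - 1]


# Three safekeepers: two have position 90 or later, so 90 is safe to commit.
print(quorum_commit_lsn([100, 90, 50]))
```

The point for the torn-page argument is that durability comes from the replicated log rather than from a data page on the compute node's local disk.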
Justy: I had to reheat my coffee because I made the mistake of letting you explain WAL before caffeine. Anyway, the user story is pretty clean. Small team, app works, traffic grows, writes start hurting, and nobody wants to become a database intern over the weekend.
Justy: If this really gives more headroom without changing app code, that's a real product story. Especially for people building transactional backends or those AI app stacks where Postgres ends up holding session state, metadata, tool traces, all the unglamorous stuff.
Cody: Sure.
Cody: The clever part is they didn't just turn off full page writes and call it a day. That would save WAL bandwidth, but reads could get ugly because storage might need to replay a very long chain of tiny deltas to reconstruct one page.
Cody: So their pageserver generates page images in storage once a page crosses some delta threshold. I actually like that. It's more grounded than tying image creation to checkpoint timing, which is kind of a blunt instrument.
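The threshold idea can be sketched in a few lines. This is a hypothetical illustration, not the actual pageserver: the class `PageHistory`, the count-based `delta_threshold`, and the `(offset, data)` delta format are all assumptions made to show the trade-off between replay length and image writes.

```python
class PageHistory:
    """One page stored as a base image plus a chain of small deltas."""

    def __init__(self, base_image: bytes, delta_threshold: int = 4):
        self.image = bytes(base_image)
        self.deltas = []  # (offset, data) pairs recorded since the last image
        self.delta_threshold = delta_threshold
        self.images_written = 0

    def append_delta(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > len(self.image):
            raise ValueError("delta does not fit inside the page")
        self.deltas.append((offset, bytes(data)))
        # Once the chain is long enough, fold it into a fresh image so that
        # later reads do not have to replay it.
        if len(self.deltas) >= self.delta_threshold:
            self._materialize()

    def _materialize(self) -> None:
        self.image = self.read()
        self.deltas.clear()
        self.images_written += 1

    def read(self) -> bytes:
        """Reconstruct the page; cost grows with the current chain length."""
        page = bytearray(self.image)
        for offset, data in self.deltas:
            page[offset:offset + len(data)] = data
        return bytes(page)


history = PageHistory(bytes(8), delta_threshold=3)
history.append_delta(0, b"a")
history.append_delta(1, b"b")
history.append_delta(2, b"c")  # third delta triggers a new image
print(history.read(), len(history.deltas), history.images_written)
```

A lower threshold keeps read replay short at the cost of more image writes in storage; a higher one does the opposite. Image creation here is driven by how much a page has changed, not by a checkpoint clock.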
Justy: That part felt important to me too. The claim isn't only faster writes. They also say around 94 percent less WAL traffic and about 2x better read tail latency, which is exactly where product claims usually get slippery.
Cody: Mm-hm.
Justy: If both sides move the right way, then okay, this is more than benchmark theater. But I still think adoption depends on whether a team believes the storage layer is now smart in a good way, not smart in a mysterious way.
Cody: Exactly.
Cody: That's my real reservation, Justy. The complexity did not vanish. It moved. Now you need confidence in pageserver behavior, image thresholds, replay costs, branch interactions, and what happens under weird contention patterns.
Cody: The blog says image generation can be shared across multiple pageservers in the background, which sounds great for scale. I just want to know what the observability looks like when one hot page goes pathological, because that's where managed systems get annoying.
Justy: Wait—
Justy: I think that's fair, but for the buyer, some of that is exactly the point. They are paying to not own that weirdness. The adoption barrier is less about technical purity and more about "do I trust this enough to move a production app that already works?"
Cody: Yeah.
Justy: And there's some market timing here. Everybody wants one operational data system that can sit near analytics and AI workflows without doing a giant platform split. So 'Postgres, but with more write headroom because the architecture is different' lands pretty well right now.
Cody: I could be wrong, but I buy the mechanism more than I buy the generic hype. No torn local pages means full page writes become optional at compute. Then storage recreates the reset points on its own terms. That's coherent engineering.
Cody: What I would not assume is automatic 5x for every workload. If the app is read-heavy, lock-heavy, or bottlenecked somewhere else, this probably feels a lot less dramatic than the headline.
Justy: So, classic episode 394 energy. The number on the billboard is loud, the actual idea is quieter and maybe better. Best fit feels like teams already committed to Postgres semantics, hitting write pressure, and willing to buy into a managed architecture shift instead of sharding themselves into sadness.
Cody: Build-next wise, I'd do two things. One, run pgbench locally against vanilla Postgres with full_page_writes on and off, plus different checkpoint settings, just to feel the WAL amplification yourself. Two, try a disaggregated Postgres service with a Neon-style storage model and compare write throughput, WAL volume, and p99 reads under a write-heavy script.
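For the first experiment, one simple way to measure WAL volume is to record `SELECT pg_current_wal_lsn()` before and after a pgbench run and subtract. The helper below is a small sketch that does the subtraction outside the database; the function names are made up, and it assumes the usual `pg_lsn` text form of two hexadecimal numbers separated by a slash. Postgres can also do this directly with `pg_wal_lsn_diff`.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a pg_lsn string such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_between(start_lsn: str, end_lsn: str) -> int:
    """Bytes of WAL generated between two pg_current_wal_lsn() readings."""
    return lsn_to_int(end_lsn) - lsn_to_int(start_lsn)


# Made-up readings taken before and after a benchmark run.
generated = wal_bytes_between("0/3000000", "0/7000000")
print(f"{generated / (1024 * 1024):.0f} MiB of WAL")
```

Running the same pgbench script with `full_page_writes` on and then off, and comparing the two byte counts, gives a workload-specific amplification figure instead of a borrowed one.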
Justy: For a solo builder, even simpler: spin up a small app that writes chat history or order events, hammer it with pgbench or k6, and watch whether the bottleneck is actually WAL traffic before shopping for magic. Anyway, I think that's the honest version. Smart architecture, real use case, still needs trust.
Cody: Yep. Good trick, not magic.
Justy: Cool. Finish your coffee before you start explaining Paxos at my kitchen counter again.