Justy: Okay so you sent me this LangSmith Sandboxes post at like eleven PM and I could already tell from the link you were annoyed about it.
Cody: I wasn't annoyed, I was — okay, I was a little annoyed. I'd just gotten back from the grocery store, I was tired, and I read the headline and thought, great, another team calling containers dangerous so they can sell you a managed runtime.
Justy: Classic. How was the grocery store, by the way?
Cody: Terrible. They were out of the good oat milk again. Anyway — the post.
Justy: The post! Right. So walk me through what actually bothered you, because I read it and thought it was pretty solid.
Cody: So the core claim is: containers share a kernel, kernels can be exploited, therefore you need microVM isolation for agent code execution. And that argument is technically correct. The Copy Fail CVE they cite is real: the exploit is a 732-byte Python script that gets root, through the kernel crypto API, on every major Linux distro going back to 2017. AI tooling found it in about an hour, which is a genuinely scary sentence.
Justy: Right, right.
Cody: But microVMs aren't new. Firecracker came out of AWS in 2018. The isolation primitive they're describing has existed for years. What they're actually shipping is a managed wrapper around that concept, integrated into the LangSmith platform. Which is fine! But the framing makes it sound like they invented the security model.
Justy: Okay but does that matter? Like — I take your point that Firecracker exists, but how many teams building agents right now are actually standing up their own microVM infrastructure?
Cody: Not many. That's fair.
Justy: Right, so the product question isn't 'did they invent the thing,' it's 'did they make the thing usable without a dedicated infra team.' And when I read the monday.com quote — their AI assistant writing and running code, generating multimedia from data analysis — that's a real production workload that someone had to solve the runtime layer for.
Cody: Yeah, I don't actually disagree with that. The Shai-Hulud worm example they use is also real. That npm supply chain attack backdoored 500-plus packages through a preinstall hook, which runs at install time, before any tests run. If your agent is pip-installing or npm-installing things at runtime, you have a genuine exposure. Containers don't save you there.
Justy: Mm-hm.
Cody: Where I get skeptical is the framing that this is the universal answer for agent workloads. The n8n RCE CVEs they mention, six in one day, including one with a 9.9 CVSS score, those are real, but n8n is a specific kind of tool with a specific execution model. Conflating that with 'therefore all agent code execution is equally dangerous' is a bit of a scare stack.
Justy: That's a fair read. Though I think the piece isn't really arguing that every agent is under attack — it's more like, if you're building something where a model generates and runs code, you have no idea what that code does, and running it in a container on your host is a category error. The container was built for known, vetted workloads. Agents are the opposite of that.
Cody: That framing I actually buy. Stateful little computers that install packages, keep long-running sessions, and execute untrusted code — that's a different execution model than a stateless web server, and containers genuinely weren't designed for it.
Justy: Okay so we're basically agreeing on the security argument. Where does your skepticism actually land then, Cody?
Cody: The moat question. The isolation is real, the CVE examples are real, the GA features — copy-on-write forks, snapshots, the auth proxy with custom callbacks so secrets never touch the runtime — those are genuinely useful primitives for parallel agent workflows. But all of this is valuable primarily if you're already deep in the LangSmith ecosystem. If you're not, you're not just adopting a sandbox, you're adopting a platform.
Justy: Sure, but I think that's true of most developer tooling at this layer. The question for a team evaluating this is: do I want to build and maintain the runtime layer myself, or do I want it bundled with my tracing, evals, and deployment pipeline? For a lot of teams, that bundled answer is actually the right one.
Cody: Yeah. And the fork model is honestly clever — copy-on-write so spinning up ten parallel branches costs about the same as one. When your agent takes a wrong path you restore a snapshot and try a different branch. That's a real workflow primitive for evaluation and debugging, not just marketing copy.
Justy: Did you just talk yourself into liking it?
Cody: I like the engineering. I'm still side-eyeing the 'containers are dangerous' headline energy. Those are two different things.
Justy: That is such a you sentence. Okay so where do we actually land — who should care about this?
Cody: Teams shipping coding agents or data analysis agents in production. If your agent generates code and runs it, the isolation argument is legitimate and the managed runtime is probably worth it. If you're building a chatbot that calls a few APIs, this is not your problem.
Justy: And if you're already in LangSmith, this is kind of a no-brainer to at least evaluate — it uses the same SDK and API key, so the switching cost is low. If you're not in LangSmith, you're making a platform bet, which is a different decision.
Cody: Yep. Solid product for a real use case, slightly over-rotated security framing. That's my read.
Justy: I'll take it. That might be the fastest we've ever agreed on anything — episode 405 and we're finally calibrated.
Cody: Don't tell anyone. It'll ruin the show.
Justy: Alright — if you want to poke at it yourself, they've got a CLI, the SDK is the same one you're already using if you're on LangSmith, and the docs are linked from the GA post. Worth a look if you're in that space. We'll see you next time.