Izzo: So here’s one that’s been making the rounds: the team behind continuous batching says your idle GPUs should be running inference, not sitting dark.
Izzo: You’re listening to Exploring Next. I’m Izzo, and Boone’s here. Let’s get into it.
Boone: Yeah, this caught my attention because training jobs finish, workloads shift, and hardware sits dark while power and cooling costs keep running.
Izzo: From a product standpoint, the interesting question is who actually ships with this. The way it works, when the operator's scheduler needs hardware back, the inference workloads are preempted and the GPUs are returned.
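The handoff Izzo describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `GpuPool` and its methods are hypothetical names invented here, not FriendliAI's or any operator's actual scheduler API, and the model simply counts GPUs in memory rather than talking to real hardware.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GpuPool:
    """Hypothetical in-memory model of an operator's GPU pool (not a real API)."""

    total: int                                   # GPUs in the cluster
    primary: int = 0                             # GPUs held by the operator's own jobs
    inference: list[str] = field(default_factory=list)  # one preemptible job per GPU
    _next_id: int = 0                            # counter for unique job names

    def idle(self) -> int:
        """GPUs that are neither running primary work nor inference."""
        return self.total - self.primary - len(self.inference)

    def release(self, count: int) -> None:
        """Primary jobs finish and give back `count` GPUs."""
        if count > self.primary:
            raise ValueError("cannot release more GPUs than primary jobs hold")
        self.primary -= count

    def fill_idle_with_inference(self) -> None:
        """Backfill every idle GPU with a preemptible inference job."""
        for _ in range(self.idle()):
            self.inference.append(f"inference-{self._next_id}")
            self._next_id += 1

    def reclaim(self, needed: int) -> list[str]:
        """The operator's scheduler asks for `needed` GPUs back.

        Idle GPUs are used first; inference jobs are preempted only to
        cover the shortfall. Returns the names of the preempted jobs.
        """
        if needed > self.total - self.primary:
            raise ValueError("request exceeds GPUs not already held by primary jobs")
        shortfall = max(0, needed - self.idle())
        preempted = [self.inference.pop() for _ in range(shortfall)]
        self.primary += needed
        return preempted


# Assumed scenario: an 8-GPU cluster whose training job finishes on 6 GPUs.
pool = GpuPool(total=8, primary=8)
pool.release(6)                    # training ends, 6 GPUs go dark
pool.fill_idle_with_inference()    # inference backfills the idle GPUs
preempted = pool.reclaim(4)        # operator needs 4 GPUs back
print(f"preempted: {preempted}, still serving: {pool.inference}")
```

The one design choice worth noting is that `reclaim` never refuses the operator: inference jobs are treated as strictly lower priority, which matches the "preempted and returned" behavior described in the episode.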
Boone: Right, and technically, FriendliAI's engine is written in C++ and uses custom GPU kernels rather than Nvidia's cuDNN library.
Izzo: Okay so what should people actually go try? The original source is a good starting point: https://venturebeat.com/infrastructure/the-team-behind-continuous-batching-says-your-idle-gpus-should-be-running
Boone: Definitely read that first. And if you want to go deeper, look into related tools in the same space, then build something small and see where it breaks.
Izzo: Good call. That’s the episode — we’ll catch you on the next one.