Justy: Okay — this is going to sound insane on a podcast, but someone finally built a memory system for Hermes Agent that might actually work.
Cody: Right. Because we all know how well the last six memory plugins for Hermes did.
Justy: Exactly — they all forget between sessions, they all need you to hand-curate your facts, and half of them are cloud-locked.
Cody: Mm-hm. So what’s this one?
Justy: It’s Memory OS from ClaudioDrews — a seven-layer local memory OS that runs entirely on your machine and remembers every single conversation you’ve ever had in Hermes.
Cody: Wait, local only? No cloud?
Justy: No cloud subscription, no vendor lock-in, works with any LLM provider Hermes supports — OpenRouter, OpenAI, Anthropic, Ollama, local models, whatever.
Cody: Sure. And the seven layers are…?
Justy: Layer one is flat markdown files that get injected into the system prompt every single turn — MEMORY.md, USER.md, CREATIVE.md, whatever you call them.
Cody: Classic. So it’s still just prompt engineering?
Justy: Except layer two is sessions stored in a SQLite database with full-text search across your entire conversation history.
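Show notes: a minimal sketch of what full-text search over stored sessions can look like, using SQLite's FTS5 extension from Python's standard library. The table name, column names, and sample rows are hypothetical; the episode does not describe the project's actual schema, and this assumes your SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical schema: the real table and column names are not given in the episode.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id, role, content)")

rows = [
    ("s1", "user", "Let's use Qdrant for the vector store"),
    ("s1", "assistant", "Noted: Qdrant with cosine similarity"),
    ("s2", "user", "Remind me which database we picked last time"),
]
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?)", rows)


def search_history(query, limit=5):
    """Return (session_id, role, content) rows, best match first."""
    cur = conn.execute(
        "SELECT session_id, role, content FROM sessions "
        "WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()


hits = search_history("qdrant")
for session_id, role, content in hits:
    print(session_id, role, content)
```

The point of an index like this is that a keyword from months ago can be found without loading the whole history into the prompt.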
Cody: Okay, that’s different. What’s layer three?
Justy: Layer three is structured facts in another SQLite table with trust scoring and entity resolution. The system adjusts those trust scores over time based on your feedback.
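Show notes: one way feedback-driven trust scoring could work, as a sketch only. The `Fact` class, the neutral starting score of 0.5, and the update rate are all assumptions made for illustration; the episode does not say how the project actually computes trust.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    entity: str
    claim: str
    trust: float = 0.5  # assumed neutral starting score


def apply_feedback(fact, confirmed, rate=0.2):
    """Nudge trust toward 1.0 when confirmed, toward 0.0 when rejected."""
    target = 1.0 if confirmed else 0.0
    fact.trust += rate * (target - fact.trust)
    return fact.trust


fact = Fact("project-x", "deploys run on Fridays")
apply_feedback(fact, confirmed=True)
apply_feedback(fact, confirmed=True)
apply_feedback(fact, confirmed=False)
print(round(fact.trust, 3))
```

An update of this shape keeps the score between 0 and 1 and lets recent feedback count for more than old feedback.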
Cody: Uh huh.
Justy: Layer four is the Fabric Icarus fork — sixteen tools for recall, write, briefing, all cross-session.
Cody: Of course there’s a fork. Sixteen tools for recall? You haven’t even opened the repo yet, have you?
Justy: I glanced! Layer five is Qdrant as the vector database: 4096-dimensional embeddings compared by cosine similarity, a BM25 fallback, weekly decay scanning, and semantic deduplication for anything with cosine similarity over point nine two.
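Show notes: a minimal sketch of semantic deduplication at a cosine threshold of 0.92, in plain Python. The tiny three-dimensional vectors stand in for real 4096-dimensional embeddings, and the `dedupe` helper is hypothetical; a real setup would query the vector database rather than compare every pair in memory.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dedupe(items, threshold=0.92):
    """Keep an item only if no already-kept item is more similar than threshold."""
    kept = []
    for text, vec in items:
        if all(cosine(vec, kept_vec) <= threshold for _, kept_vec in kept):
            kept.append((text, vec))
    return kept


items = [
    ("We chose Qdrant", [1.0, 0.0, 0.0]),
    ("We picked Qdrant", [0.99, 0.1, 0.0]),   # near-duplicate of the first
    ("Deploys happen Friday", [0.0, 1.0, 0.0]),
]
unique = dedupe(items)
print([text for text, _ in unique])
```

The first item seen wins here; a production system might instead keep the newer or more trusted of two near-duplicates.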
Cody: Surprise — it uses Qdrant.
Justy: And layer six is an auto-curating LLM wiki that just writes itself from your conversations.
Cody: Which is exactly how every auto-curating wiki project starts — in over its head.
Justy: Okay okay — but the bottom line is you finally get a Hermes Agent that remembers your projects, decisions, and reasoning without paying a memory subscription.
Cody: Assuming you can keep the Docker container alive and Qdrant from eating your SSD.
Justy: That’s on you. The repo is five days old, but the pitch is ‘your agent finally stops forgetting.’
Cody: So who should actually care? And does it change anything practical?
Justy: Power users who live in Hermes Agent every day and have already hit the wall where every session starts with ‘Remind me what we did last time.’
Cody: That’s like three people.
Justy: Or anyone who’s paid for a memory add-on that still flakes.
Cody: Sure, but local-only memory stacks are their own kind of lock-in. You still need Hermes Agent, Docker, Qdrant, Redis, an ARQ worker, and Python 3.11.
Justy: Independent research, no subscriptions — that’s the pitch.
Cody: Right. And the vector search over every conversation ever had … that’s table stakes in twenty twenty-six. BM25 plus dense embeddings isn’t exactly novel.
Justy: It is if the alternative is pasting your entire history into a prompt and praying the context window holds.
Cody: Or using one of the three cloud memory services that already do this.
Justy: With monthly fees and NDAs.
Cody: Point taken.
Cody: Three hours in Charlotte. Managed to eat a biscuit the size of my fist and still miss my gate.
Justy: Classic. Anyway — Memory OS, or at least the idea of not paying monthly for an agent’s memory anymore.