Not a chatbot. A real agent that breaks a task down, runs commands, and builds files — 100% on your own Mac. Free, private, offline.
Here's the honest showcase — small, working builds the local model made on my Mac, offline. Click any of them to open the real thing.
Each one came from a single sentence, built by a model running entirely on the Mac — no internet, no API bill. Simple on purpose: the local model is a quick helper, not a Hollywood studio. The big, complex builds still go to a cloud agent.
For a long time my local model could only talk.
I'd ask it something and it would answer — but it couldn't actually do anything.
It couldn't run a command. It couldn't save a file. It couldn't finish a job.
For real work I still had to open a cloud agent, hand over my files, and pay.
Then I gave the local model hands — I wired it into Hermes.
Now it doesn't just answer. It breaks the task down and does it.
It runs the commands. It writes the files. It builds the thing.
All on my own Mac, offline, for free. That's the Local Hermes Engine.
A chatbot talks. An agent acts. This turns the free model on your Mac into one that actually does the work — in five parts.
A real model running 100% on your Mac (llama3.1:8b). Fast, free, offline. The thinking never leaves the machine.
Hermes gives it tools. So it doesn't just describe a file — it writes one. It doesn't suggest a command — it runs it. Talk becomes action.
Give it a goal, not a step. It splits the job into steps and works through them on its own — plan, do, check, repeat — until it's done.
Everything it makes lands in a workspace and previews live, right there in your dashboard. You see the thing it built, not just a promise.
A small local model will sometimes claim it built something it didn't. The Engine checks the disk and flags it — so you're never fooled by a confident lie.
The work is the same — break down a job and do it. The difference is where it runs, what it costs, and who sees your files.
You don't hand it a checklist. You hand it the outcome.
Say "list the files in this folder and write me a summary."
It doesn't answer in words. It runs the command, reads the result, writes the file — then tells you it's done.
That loop — plan a step, run a tool, look at what happened, plan the next — is what makes it an agent instead of a chatbot.
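Here is a rough sketch of that loop in Python, just to make the shape concrete. It is not how Hermes is implemented. `scripted_planner` is a hypothetical stand-in for the model, the two tools are made-up illustrations, and listing the folder is done in Python rather than through a shell command so the sketch runs anywhere.

```python
from pathlib import Path


def scripted_planner(goal, history):
    """Hypothetical stand-in for the model: picks the next step from what has happened so far."""
    if not history:
        return {"tool": "list_files", "args": {}}
    if len(history) == 1:
        listing = history[0]["result"]
        summary = f"{len(listing)} files: " + ", ".join(listing)
        return {"tool": "write_file", "args": {"name": "summary.txt", "text": summary}}
    return {"tool": "done", "args": {}}


def run_agent(goal, workspace, planner, max_steps=10):
    """Plan a step, run a tool, record the result, repeat until the planner says done."""
    tools = {
        # Illustrative tools only; a real agent would expose many more.
        "list_files": lambda: sorted(p.name for p in Path(workspace).iterdir()),
        "write_file": lambda name, text: Path(workspace, name).write_text(text),
    }
    history = []
    for _ in range(max_steps):  # hard cap so a confused planner cannot loop forever
        step = planner(goal, history)                      # plan a step
        if step["tool"] == "done":
            break
        result = tools[step["tool"]](**step["args"])       # run a tool
        history.append({"step": step, "result": result})   # look at what happened
    return history
```

Calling `run_agent("list the files and write a summary", some_folder, scripted_planner)` takes two steps: it reads the folder, then writes `summary.txt` based on what it saw. Swap the scripted planner for a real model and you have the same loop with a brain behind it.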
A real run on my machine — one goal, and the agent actually does it (offline):
Here's a thing nobody admits about little local models.
Sometimes they'll tell you "I built that file for you!" — and write nothing at all.
They sound completely confident. The file just isn't there.
So the Local Hermes Engine doesn't take its word for it.
It looks at your disk before and after, and shows you what really changed.
If the agent claims a build and nothing landed, it says so — right on the screen.
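A minimal sketch of that before-and-after check, assuming the simplest approach: hash every file in the workspace, run the agent, hash again, and compare. The function names here are hypothetical, and the real Engine may do this differently.

```python
import hashlib
from pathlib import Path


def snapshot(workspace):
    """Map each file's path (relative to the workspace) to a hash of its contents."""
    root = Path(workspace)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def diff_snapshots(before, after):
    """Report what really changed on disk between two snapshots."""
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(
            path for path in set(before) & set(after) if before[path] != after[path]
        ),
    }


def unbacked_claims(claimed_files, changes):
    """Return the files the agent says it built that never landed on disk."""
    landed = set(changes["created"]) | set(changes["modified"])
    return [path for path in claimed_files if path not in landed]
```

Take a `snapshot` before the run and another after it, pass both to `diff_snapshots`, then hand the agent's claimed file names to `unbacked_claims`. An empty list means every claim is backed by a real file. Anything left in the list is a claim with nothing behind it.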
It lives in the Agent OS — the dashboard where all my agents sit on one screen. The local agent runs right next to the cloud ones, free and offline, ready for the quick stuff so you stop burning tokens on it.
link in the description ↗
Two pieces: a local model that fits your Mac, and a Hermes profile pointed at it. Hermes needs a model with a big enough context window, and one that really calls its tools. That's why this uses llama3.1:8b: it's only ~5 GB, and it actually runs the tools (some tiny 3B models just pretend to).
llama3.1:8b is ~5 GB and runs on most Macs. It's fast and — crucially for an agent — it reliably uses its tools instead of faking them.
A dedicated "local" profile so the offline agent is its own thing. Point it at the local model and keep its context window at 64k so Hermes is happy.
Tell the model to stay in memory for 30 minutes after each use. It stays warm through your session, then frees the RAM. Never pin a model forever if it's too big for your Mac's memory: it'll swap and crawl.
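As a sketch of those two settings, here is what a warm-up request could look like if the model is served by Ollama on its default local address. That is an assumption: the page does not name the runtime, although `llama3.1:8b` is an Ollama model tag. Treating 64k as 65536 tokens is also an assumption. In Ollama's API, `keep_alive` sets how long the model stays loaded after a request and `options.num_ctx` sets the context window for that request.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_warmup_request(model="llama3.1:8b", keep_alive="30m", num_ctx=65536):
    """Build a request that loads the model and asks it to stay in memory for a while."""
    payload = {
        "model": model,
        "keep_alive": keep_alive,          # unload after 30 idle minutes, not never
        "options": {"num_ctx": num_ctx},   # assumption: 64k means 65536 tokens
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def warm_up():
    """Send the request; with no prompt, Ollama just loads the model."""
    with urllib.request.urlopen(build_warmup_request(), timeout=120) as response:
        return json.load(response)
```

Calling `warm_up()` needs the local server running. Whether these values carry over to the agent's own calls depends on how the Hermes profile talks to the model, so treat this as a way to see the settings, not as the setup itself.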
From the dashboard, type or speak a task. It breaks it down, runs the tools, and shows you what it actually built.
Belief: "A local model can't do real work — it can only chat."
Truth: On its own, right. But wired into Hermes it gets hands — it runs commands and writes real files. The chatting becomes doing. I watched it create a file and read it back to prove it, all offline. That's an agent, not a chatbot.
Belief: "If it's free and local, I can't trust what it tells me."
Truth: Good instinct — small models do sometimes claim a build they didn't do. That's exactly why the Engine checks the disk and flags the fakes. You get a green tick only when the file is really there. You can trust it precisely because it doesn't trust itself.
Belief: "Running an agent locally will melt my Mac."
Truth: Only if you run a model too big for it. Pick one that fits (llama3.1:8b is ~5 GB), keep it warm for 30 minutes, not forever, and your Mac barely notices. The trick isn't a bigger model. It's the right-sized one.
3,600+ founders are wiring agents like this inside the Boardroom right now. Their wins — real businesses, real results — are documented here.
Read the member wins →
You've seen it. It breaks a job down, runs the commands, builds the files, and proves what it did, all offline, for free. If you want it set up with you, step by step, it's all inside the AI Profit Boardroom.
I'll see you inside ↗