New · a real agent that runs on your Mac
The framework · Hermes + your local model

The Local Hermes Engine.

Not a chatbot. A real agent that breaks a task down, runs commands, and builds files — 100% on your own Mac. Free, private, offline.

[Diagram · all inside your Mac, offline: you give it a goal (voice or text) → Hermes breaks it down (goal → steps) → it runs the tools (commands, writes files) → it's built and verified (checked on disk, previewed live). No cloud, no cost, nothing leaves your Mac.]
You give it a goal → it plans → it runs the tools → it builds the thing. Every step on your own machine.

What it built · open + play with them

Real things, built by the local model. Touch them.

Here's the honest showcase — small, working builds the local model made on my Mac, offline. Click any of them to open the real thing.

  • Neon clock · "a glowing neon digital clock"
  • Breathing neon orb · "a glowing orb that pulses forever"
  • Tip + split calculator · "tip %, split, total per person"

Each one came from a single sentence, built by a model running entirely on the Mac — no internet, no API bill. Simple on purpose: the local model is a quick helper, not a Hollywood studio. The big, complex builds still go to a cloud agent.

3,600+ · Founders inside AIPB
400k · YouTube subscribers
163k · X / Twitter followers
38 · Countries with live members

For a long time my local model could only talk.

I'd ask it something and it would answer — but it couldn't actually do anything.

It couldn't run a command. It couldn't save a file. It couldn't finish a job.

For real work I still had to open a cloud agent, hand over my files, and pay.

Then I gave the local model hands — I wired it into Hermes.

Now it doesn't just answer. It breaks the task down and does it.

It runs the commands. It writes the files. It builds the thing.

All on my own Mac, offline, for free. That's the Local Hermes Engine.

The framework · the Local Hermes Engine

The Local Hermes Engine™.

A chatbot talks. An agent acts. This turns the free model on your Mac into one that actually does the work — in five parts.

i.

The Local Brain

A real model running 100% on your Mac (llama3.1:8b). Fast, free, offline. The thinking never leaves the machine.

ii.

It Has Hands

Hermes gives it tools. So it doesn't just describe a file — it writes one. It doesn't suggest a command — it runs it. Talk becomes action.

iii.

It Breaks It Down

Give it a goal, not a step. It splits the job into steps and works through them on its own — plan, do, check, repeat — until it's done.

iv.

You Watch It Build

Everything it makes lands in a workspace and previews live, right there in your dashboard. You see the thing it built, not just a promise.

v.

It Tells The Truth

A small local model will sometimes claim it built something it didn't. The Engine checks the disk and flags it — so you're never fooled by a confident lie.

Old way vs new way

A cloud agent vs an agent you own.

The work is the same — break down a job and do it. The difference is where it runs, what it costs, and who sees your files.

Old way · cloud agent
$ per task · needs internet
  • Every command and file is sent to someone else's server
  • You pay per token, every single run
  • No internet, no agent — it just stops
  • Rate limits cut you off mid-job
  • You trust it built the thing — you can't see the disk
  • Result: a powerful agent you rent and don't control

New way · the Local Hermes Engine
$0 · fully offline
  • Every step runs on your own Mac — nothing is sent out
  • No bill, no tokens, no limits — run it all day
  • Works on a plane, in a cafe, with the wifi off
  • It breaks the goal into steps and does them itself
  • It checks the disk and shows you exactly what it built
  • Result: a real agent you own, free and private

How it works · plan, do, check

One goal in. A finished job out.

You don't hand it a checklist. You hand it the outcome.

Say "list the files in this folder and write me a summary."

It doesn't answer in words. It runs the command, reads the result, writes the file — then tells you it's done.

That loop — plan a step, run a tool, look at what happened, plan the next — is what makes it an agent instead of a chatbot.

[Diagram · one goal → many steps: your goal "summarise this folder" › run: list the files › read: what's in them › write: summary.md › done ✓ file on your disk]
It plans the steps, runs each tool, checks the result — and only says "done" when the file is really there.
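The loop above can be sketched in a few lines of shell. This is a minimal illustration, not how Hermes is implemented: the plan is hard-coded where the local model would normally propose each step, and every name here (PLAN, run_step, check_step, run_goal) is made up for the example.

```shell
#!/usr/bin/env bash
# Illustrative only. PLAN is a hard-coded stand-in for the steps the model
# would propose; run_step, check_step and run_goal are hypothetical names.
PLAN="list read write"

# do: run one named step.
# $1 = step, $2 = folder to summarise, $3 = output file, $4 = scratch dir
run_step() {
  case "$1" in
    list)  ls -1 "$2" > "$4/names.txt" ;;
    read)  while IFS= read -r name; do
             if [ -f "$2/$name" ]; then
               printf '%s: %s bytes\n' "$name" "$(wc -c < "$2/$name" | tr -d ' ')"
             fi
           done < "$4/names.txt" > "$4/facts.txt" ;;
    write) { echo "# Summary of $2"; cat "$4/facts.txt"; } > "$3" ;;
    *)     return 1 ;;
  esac
}

# check: did the step leave behind the artifact it was supposed to?
check_step() {
  case "$1" in
    list)  [ -f "$4/names.txt" ] ;;
    read)  [ -f "$4/facts.txt" ] ;;
    write) [ -s "$3" ] ;;
    *)     return 1 ;;
  esac
}

# plan, do, check, repeat. Only report "done" after the final check passes.
run_goal() {
  local folder=$1 out=$2 scratch step
  scratch=$(mktemp -d) || return 1
  for step in $PLAN; do
    run_step "$step" "$folder" "$out" "$scratch" || { echo "failed: $step"; return 1; }
    check_step "$step" "$folder" "$out" "$scratch" || { echo "unverified: $step"; return 1; }
    echo "ok: $step"
  done
  echo "done: $out"
}

# Usage: run_goal ~/some-folder ~/workspace/summary.md
```

The point of the sketch is the ordering: each step is checked before the next one starts, and the word "done" is only printed once the output file exists and is non-empty.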

A real run on my machine — one goal, and the agent actually does it (offline):

you › create clock.html — a live digital clock — then read it back to confirm
engine › wrote clock.html (945 bytes)
› read it back — contents verified
✓ done — clock.html is in your workspace ← the file is really there

The honest part · no fake wins

Small models lie. This one gets caught.

Here's a thing nobody admits about little local models.

Sometimes they'll tell you "I built that file for you!" — and write nothing at all.

They sound completely confident. The file just isn't there.

So the Local Hermes Engine doesn't take its word for it.

It looks at your disk before and after, and shows you what really changed.

If the agent claims a build and nothing landed, it says so — right on the screen.

[Diagram · claim → check the disk → truth: the agent says "I built it! ✨" → 🔎 check the disk: did a file really appear? → either "✓ Built clock.html, it's real" or "⚠ It claimed a build, nothing landed"]
Green when the file is really on disk. A clear warning when the agent talked big and built nothing. No fake wins.
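The before-and-after check can be approximated with standard tools. This is a rough sketch under my own assumptions, not the Engine's actual code: snapshot, changed_since and verdict are hypothetical helper names, and the snapshot is a sorted list of checksums.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Hermes: record what is on disk, then diff.

# Print one "checksum size path" line per file under a directory, sorted.
snapshot() {
  find "$1" -type f -exec cksum {} + | LC_ALL=C sort
}

# Print lines that appear only in the "after" snapshot: new or modified files.
# $1 = file holding the "before" snapshot, $2 = workspace directory
changed_since() {
  snapshot "$2" | LC_ALL=C comm -13 "$1" -
}

# Base the verdict on the disk, not on what the agent said.
verdict() {
  local changes
  changes=$(changed_since "$1" "$2")
  if [ -n "$changes" ]; then
    echo "BUILT"
    echo "$changes"
  else
    echo "CLAIMED BUT NOTHING LANDED"
  fi
}

# Usage (keep the snapshot file outside the workspace it describes):
#   before=$(mktemp); snapshot ~/workspace > "$before"
#   ... let the agent run ...
#   verdict "$before" ~/workspace
```

Because it compares checksums, this catches new files and changed contents. It does not report deleted files, which would need the opposite comparison.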

Set it up with us

Want the Local Hermes Engine on your Mac?

It lives in the Agent OS — the dashboard where all my agents sit on one screen. The local agent runs right next to the cloud ones, free and offline, ready for the quick stuff so you stop burning tokens on it.

Get the Agent OS →

link in the description ↗

Do it yourself · the setup

Wire it in — about ten minutes.

Two pieces: a local model that fits your Mac, and a Hermes profile pointed at it. Hermes needs a model with a big enough context window, which is why this uses llama3.1:8b. It is only a ~5 GB download, it supports long contexts (up to 128k tokens), and it actually runs the tools (some tiny 3B models just pretend to).

Pull a light, capable model.

llama3.1:8b is ~5 GB and runs on most Macs. It's fast and — crucially for an agent — it reliably uses its tools instead of faking them.

Make a Hermes profile pointed at it.

A dedicated "local" profile so the offline agent is its own thing. Point it at the local model and set its context window to 64k explicitly. Ollama typically loads a model with a far smaller default context, and Hermes needs the room.

Keep it warm — for a while, not forever.

Tell the model to stay in memory for 30 minutes after its last use. It stays warm through your session, then frees the RAM. Never pin a model that doesn't fit comfortably in your Mac's memory: it will swap and crawl.

Give it a goal.

From the dashboard, type or speak a task. It breaks it down, runs the tools, and shows you what it actually built.

# 1. a light model that runs tools reliably (~5 GB)
ollama pull llama3.1:8b

# 2. a dedicated offline Hermes profile
hermes profile create local
# point it at llama3.1:8b, context 64k, in the profile config

# 3. stay warm for 30 min after use (NOT forever)
# restart the Ollama app afterwards so it picks up the new value
launchctl setenv OLLAMA_KEEP_ALIVE 30m

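# 4. give it a real job (runs offline, with tools)
# note: `local` is also a shell builtin; if your shell intercepts it, pick a different profile name
local -z "create notes.md with my 3 top tasks, then read it back"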
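The steps above set the 64k context in the Hermes profile config. A different route, offered here as an assumption and not as the author's method, is to bake the context size into an Ollama model variant with a Modelfile and point the profile at that variant. The variant name llama3.1-64k and the helper write_modelfile are made up for this sketch.

```shell
#!/usr/bin/env bash
# Write a Modelfile that derives a 64k-context variant of llama3.1:8b.
# The variant name used below (llama3.1-64k) is an arbitrary, hypothetical choice.
write_modelfile() {
  # $1 = path of the Modelfile to create
  cat > "$1" <<'EOF'
FROM llama3.1:8b
PARAMETER num_ctx 65536
EOF
}

# Usage (needs Ollama installed and the base model already pulled):
#   write_modelfile ./Modelfile
#   ollama create llama3.1-64k -f ./Modelfile
#   ollama ps    # after a request, lists the loaded model and when it unloads
```

Running `ollama ps` after a request is also a quick way to confirm that the 30 minute keep-alive took effect.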

Should you do this · what holds people back

The three things people get wrong — backwards.

Belief: "A local model can't do real work — it can only chat."

Truth: On its own, right. But wired into Hermes it gets hands — it runs commands and writes real files. The chatting becomes doing. I watched it create a file and read it back to prove it, all offline. That's an agent, not a chatbot.

Belief: "If it's free and local, I can't trust what it tells me."

Truth: Good instinct — small models do sometimes claim a build they didn't do. That's exactly why the Engine checks the disk and flags the fakes. You get a green tick only when the file is really there. You can trust it precisely because it doesn't trust itself.

Belief: "Running an agent locally will melt my Mac."

Truth: Only if you run a model too big for it. Pick one that fits (llama3.1:8b is ~5 GB), keep it warm for 30 minutes not forever, and your Mac barely notices. The trick isn't a bigger model — it's the right-sized one.
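"Right-sized" depends on the context window as well as the ~5 GB of weights, because the model keeps a key/value cache for every token in context. Here is a back-of-envelope estimate. It assumes the published Llama 3.1 8B shape (32 layers, 8 key/value heads, head dimension 128) and a 16-bit cache. The function names are made up for the example, and real usage depends on the runtime and its cache settings.

```shell
#!/usr/bin/env bash
# Rough KV-cache size for a Llama-3.1-8B-shaped model with a 16-bit cache.
# Assumed architecture: 32 layers, 8 KV heads, head dimension 128.
kv_cache_bytes() {
  # $1 = context length in tokens
  local layers=32 kv_heads=8 head_dim=128 bytes_per_value=2
  # the leading 2 counts keys and values separately
  echo $(( 2 * layers * kv_heads * head_dim * bytes_per_value * $1 ))
}

# Convert bytes to whole GiB (rounds down).
to_gib() {
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Usage:
#   to_gib "$(kv_cache_bytes 65536)"   # cache for a 64k window
#   to_gib "$(kv_cache_bytes 8192)"    # cache for an 8k window
```

Under these assumptions a full 64k window works out to 8 GiB of cache on top of the weights, and an 8k window to 1 GiB. If the runtime reserves the cache for the whole window, a Mac with 16 GB or less may be tight at 64k. A smaller window or a quantized cache brings the number down.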

Don't take my word for it

3,600+ founders are wiring agents like this inside the Boardroom right now. Their wins — real businesses, real results — are documented here.

Read the member wins →

Your turn

Give the free model on your Mac a pair of hands.

You've seen it. It breaks a job down, runs the commands, builds the files, and proves what it did — all offline, for free. If you want it set up with you, step by step, it's all inside the AI Profit Boardroom.

Get the Agent OS →

I'll see you inside ↗