Cohere just dropped North Mini Code — a free coding model that out-scores models 4× its size. And it runs 100% on your own Mac.
For months I paid for a coding AI every single day.
I tried free local coders, but they were slow or made junk.
So I figured good coding AI just had to be the big paid kind.
Then Cohere dropped North Mini Code — free, made only for coding.
I pulled it onto my Mac and asked it to build me a few apps.
It answered at 92 words a second — faster than anything I'd run.
And it beats models four times its size at writing code.
Now there's a free coding engine living inside my Mac.
It builds tools, pages and games for me — offline, no bill.
I just say what I want, and watch it appear. That changed everything.
A free coding model that lives inside your own Mac — fast, private, made for building. Five things make it work.
North Mini Code is trained for one job: writing software. Not a general chatbot dabbling in code — a model built to build.
30B in size but only a few experts fire per word — so it's quick on your Mac, yet out-scores models 4× larger at code.
Turn its reasoning on for tricky jobs, or off for instant builds. You pick speed or depth, per task.
Apache-2.0 licensed and offline. No bill, no tokens, no limits — and nothing you type ever leaves the machine.
Wired into your dashboard, you just say "build me a calculator" — it writes it, previews it live, and saves it.
Here's the shift. Most people rent a coding AI by the month and send every line to a server. North flips that.
Here's the trick that makes North fast on a laptop.
Old models wake their whole brain for every single word. Big means slow.
North is split into 128 experts, and only 8 of them fire for each word.
Giant brain, tiny effort. That's how a 30B model runs as quick as a small one.
On my Mac: North runs at ~92 words a second and takes about 19 GB of memory — light enough to leave plenty of room on a 36 GB machine. Compare that to a dense 28 GB model that crawled at 16. Right shape beats big size.
I asked North to build 16 things in a row — games, tools, pages, visuals.
It cranked them out fast, about 20–40 seconds each.
The tools and pages were genuinely good — a full scientific calculator, a todo app, a pricing page, a drum machine.
The trickier games and animations were hit-or-miss — some played, some needed a fix.
That's the honest picture: no local model one-shots everything. North nails the well-defined builds and gets you 80% there on the hard ones — fast, free, and on your machine.
Two small things make North feel instant and stay private.
First: keep it warm in memory, so it never reloads. The lag people blame on "slow models" is just the cold start.
Second: it runs on a loop inside your Mac and never phones home. I checked — while it built a whole app, not one byte left the machine.
I put North Mini Code inside the Agent OS — voice in, builds previewing live, every app saved in one workspace. It sits next to my cloud agents as the free, fast coder.
link in the description ↗
North needs the latest Ollama (it's a brand-new model). Update first, then four steps.
North is new, so grab the latest from ollama.com. Older versions can't run it yet.
The q4 version is about 19 GB. Fits a 16–36 GB Mac. After this it's yours, offline.
One setting tells it to stay in memory instead of reloading. This is what makes it instant.
Send a quick "hello" so it loads. Check it's warm — and you're live.
Lighter Mac? gpt-oss:20b (OpenAI, ~12 GB, ~75 words/sec) is a great lighter pick with no thinking mode. Both run the exact same way.
Belief: "A free local model can't really code."
Truth: North Mini Code out-scores models 4× its size on a public coding benchmark — and it's free, Apache-2.0. On my Mac it built a full scientific calculator, a todo app and a pricing page from one sentence each. The "free can't code" rule just broke.
Belief: "A 30 GB-class model will be way too slow on my Mac."
Truth: Backwards. North only fires 8 of its 128 experts per word, so it runs at ~92 words a second — faster than the small models. The slow ones are the old dense models that wake their whole brain. Right shape, not small size.
Belief: "Setting up a local model is too technical for me."
Truth: Update one app, then three short commands. If you can copy and paste a line, you can run this. Ten minutes, start to finish — and then it's free forever.
3,600+ founders are running fast local coders like this inside the Boardroom right now. Their wins — real businesses, real results — are documented here.
Read the member wins →You stopped paying. A free coder, Apache-2.0, runs on the Mac you own — that monthly bill is gone.
You out-punch the giants. A 30B model that beats 120B ones at code, on your laptop.
You stopped waiting. Only 8 experts fire per word — ~92 words a second, instant builds.
You stopped leaking. Nothing you type ever leaves your Mac — proven, zero connections out.
You pick speed or depth. Thinking off for instant builds, on for the tricky jobs.
You build by voice. Say it, watch a whole app appear live, and it saves itself.
You've seen it. North Mini Code is free, fast, made for code, and runs offline. If you want it set up with you — wired into a full dashboard, step by step — it's all inside the AI Profit Boardroom.
I'll see you inside ↗