The Local AI Coding Engine — North Mini Code: a free coder that beats models 4× its size, on your Mac

Straight from Cohere · official sources

Read it + run it yourself ↓

the announcementIntroducing North Mini Code → the modelNorth Mini Code on Hugging Face → run it locallyNorth Mini Code on Ollama → who made itCohere →

"North Mini Code scores 33.4 on Artificial Analysis' Coding Index, outperforming similarly sized open models… as well as substantially larger models including Nemotron 3 Super (120B), Mistral Small 4 (119B), and Devstral 2 (123B)."— Cohere Labs, North Mini Code release, June 2026

3,600+Founders inside AIPB
400kYouTube subscribers
163kX / Twitter followers
38Countries · live members

For months I paid for a coding AI every single day.

I tried free local coders, but they were slow or made junk.

So I figured good coding AI just had to be the big paid kind.

Then Cohere dropped North Mini Code — free, made only for coding.

I pulled it onto my Mac and asked it to build me a few apps.

It answered at 92 words a second — faster than anything I'd run.

And it beats models four times its size at writing code.

Now there's a free coding engine living inside my Mac.

It builds tools, pages and games for me — offline, no bill.

I just say what I want, and watch it appear. That changed everything.

The framework · the Local AI Coding Engine

The Local AI Coding Engine™.

A free coding model that lives inside your own Mac — fast, private, made for building. Five things make it work.

A Coder, Not A Chatbot

North Mini Code is trained for one job: writing software. Not a general chatbot dabbling in code — a model built to build.

ii.

Small Body, Big Punch

30B in size but only a few experts fire per word — so it's quick on your Mac, yet out-scores models 4× larger at code.

iii.

Thinking On Demand

Turn its reasoning on for tricky jobs, or off for instant builds. You pick speed or depth, per task.

iv.

Free + Yours

Apache-2.0 licensed and offline. No bill, no tokens, no limits — and nothing you type ever leaves the machine.

Build By Voice

Wired into your dashboard, you just say "build me a calculator" — it writes it, previews it live, and saves it.

Old way vs new way

Paying for a coder vs owning one.

Here's the shift. Most people rent a coding AI by the month and send every line to a server. North flips that.

Old way · renting a cloud coder

$20–200/mo

Pay every month for a coding assistant
Every line of code leaves your machine
Hit limits mid-build, wait for the reset
Tried free local coders — slow, or wrote junk
Decided you need the big paid model
Result: you rent your coder and hand over your code

New way · North Mini Code on your Mac

$0/mo

A free coder built only for software
Beats models 4× its size at writing code
92 words a second on a normal Mac — no limits
Nothing you type ever leaves the machine
Say the word and a whole app appears, live
Result: a fast, private coding engine you own

Why a 30B model is quick

It only wakes a few experts per word.

Here's the trick that makes North fast on a laptop.

Old models wake their whole brain for every single word. Big means slow.

North is split into 128 experts, and only 8 of them fire for each word.

Giant brain, tiny effort. That's how a 30B model runs as quick as a small one.

128 experts in the model. Only the 8 best for each word light up. Big brain, small effort, fast answers.

On my Mac: North runs at ~92 words a second and takes about 19 GB of memory — light enough to leave plenty of room on a 36 GB machine. Compare that to a dense 28 GB model that crawled at 16. Right shape beats big size.

What I built with it · honest

I gave it 16 jobs. Here's the truth.

I asked North to build 16 things in a row — games, tools, pages, visuals.

It cranked them out fast, about 20–40 seconds each.

The tools and pages were genuinely good — a full scientific calculator, a todo app, a pricing page, a drum machine.

The trickier games and animations were hit-or-miss — some played, some needed a fix.

That's the honest picture: no local model one-shots everything. North nails the well-defined builds and gets you 80% there on the hard ones — fast, free, and on your machine.

you › build me a scientific calculator
north › Here's a self-contained calculator ↓
<!DOCTYPE html> … keypad · operators · sin/cos/tan/log/√/π · live display
→ previewed live · saved to your workspace · ~38s

Fast + private · the two settings

Keep it warm. Keep it home.

Two small things make North feel instant and stay private.

First: keep it warm in memory, so it never reloads. The lag people blame on "slow models" is just the cold start.

Second: it runs on a loop inside your Mac and never phones home. I checked — while it built a whole app, not one byte left the machine.

# warm + listening only on your own machine
$ ollama ps
north-mini-code-1.0 19 GB 100% GPU warm
$ lsof -iTCP -c ollama | grep -v 127.0.0.1
(nothing) ← zero connections out. fully offline.

Set it up with us

Want North wired into your dashboard?

I put North Mini Code inside the Agent OS — voice in, builds previewing live, every app saved in one workspace. It sits next to my cloud agents as the free, fast coder.

The full Agent OS — every agent in one place, builds previewing live
A 30-day roadmap to run a free, private coder of your own
Four coaching calls a week with people building this in production
A room of 3,600+ founders — someone's online 24/7

Get the Agent OS →

link in the description ↗

Do it yourself · the setup

Run North in about ten minutes.

North needs the latest Ollama (it's a brand-new model). Update first, then four steps.

Update Ollama.

North is new, so grab the latest from ollama.com. Older versions can't run it yet.

Pull North — once.

The q4 version is about 19 GB. Fits a 16–36 GB Mac. After this it's yours, offline.

Keep it warm.

One setting tells it to stay in memory instead of reloading. This is what makes it instant.

Wake it once.

Send a quick "hello" so it loads. Check it's warm — and you're live.

# 1. update Ollama first (North needs the latest) → https://ollama.com

# 2. pull North Mini Code (q4, ~19 GB)
ollama pull north-mini-code-1.0:q4_K_M

# 3. keep it warm so it never reloads
launchctl setenv OLLAMA_KEEP_ALIVE 30m

# 4. wake it (thinking off = instant builds)
ollama run north-mini-code-1.0:q4_K_M "build me a snake game"

# check it's warm:
ollama ps
north-mini-code-1.0 19 GB 100% GPU warm ← ~92 words/sec

Lighter Mac? gpt-oss:20b (OpenAI, ~12 GB, ~75 words/sec) is a great lighter pick with no thinking mode. Both run the exact same way.

Should you do this · what holds people back

The three things people get wrong — backwards.

Belief: "A free local model can't really code."

Truth: North Mini Code out-scores models 4× its size on a public coding benchmark — and it's free, Apache-2.0. On my Mac it built a full scientific calculator, a todo app and a pricing page from one sentence each. The "free can't code" rule just broke.

Belief: "A 30 GB-class model will be way too slow on my Mac."

Truth: Backwards. North only fires 8 of its 128 experts per word, so it runs at ~92 words a second — faster than the small models. The slow ones are the old dense models that wake their whole brain. Right shape, not small size.

Belief: "Setting up a local model is too technical for me."

Truth: Update one app, then three short commands. If you can copy and paste a line, you can run this. Ten minutes, start to finish — and then it's free forever.

Don't take my word for it

3,600+ founders are running fast local coders like this inside the Boardroom right now. Their wins — real businesses, real results — are documented here.

Read the member wins →

Recap · what you walk away with

What you gain.

You stopped paying. A free coder, Apache-2.0, runs on the Mac you own — that monthly bill is gone.

ii.

You out-punch the giants. A 30B model that beats 120B ones at code, on your laptop.

iii.

You stopped waiting. Only 8 experts fire per word — ~92 words a second, instant builds.

iv.

You stopped leaking. Nothing you type ever leaves your Mac — proven, zero connections out.

You pick speed or depth. Thinking off for instant builds, on for the tricky jobs.

vi.

You build by voice. Say it, watch a whole app appear live, and it saves itself.

Your turn

Put a free coder that beats the giants inside your Mac.

You've seen it. North Mini Code is free, fast, made for code, and runs offline. If you want it set up with you — wired into a full dashboard, step by step — it's all inside the AI Profit Boardroom.

The full Agent OS — North, Claude, GLM, Hermes and more, one dashboard
The Local AI Coding Engine setup — North, the warm-pin, the voice build surface
A 30-day roadmap, daily tutorials, and four coaching calls a week
A room of 3,600+ builders doing this every day

Get the Agent OS →

I'll see you inside ↗

Before you go →

Join 3,600+ founders building with this stack.

The AI Profit Boardroom is where the actual Agent OS lives — the templates, the prompts, the daily rooms, the weekly walkthroughs. Same builds you read about here, taught hands-on inside.

3,600+Members

258Documented wins

38Countries

Join AI Profit Boardroom →

No card on this page. Opens in a new tab.