New model · North Mini Code just dropped
The framework · North Mini Code on your Mac

The Local AI Coding Engine.

Cohere just dropped North Mini Code — a free coding model that out-scores models 4× its size. And it runs 100% on your own Mac.

Chart · coding score by model size (Artificial Analysis Coding Index): North Mini Code (30B, free) scores 33.4, ahead of Devstral 2 (123B), Mistral Small 4 (119B) and Nemotron 3 Super (120B), each about 4× bigger. A 30B model on your laptop beats 120B models on big servers.
North Mini Code · tops its class AND out-scores models 4× larger on coding — and you run it free, on your Mac.
Straight from Cohere · official sources
Read it + run it yourself ↓
"North Mini Code scores 33.4 on Artificial Analysis' Coding Index, outperforming similarly sized open models… as well as substantially larger models including Nemotron 3 Super (120B), Mistral Small 4 (119B), and Devstral 2 (123B)." (Cohere Labs, North Mini Code release, June 2026)
3,600+ Founders inside AIPB
400k YouTube subscribers
163k X / Twitter followers
38 Countries · live members

For months I paid for a coding AI every single day.

I tried free local coders, but they were slow or made junk.

So I figured good coding AI just had to be the big paid kind.

Then Cohere dropped North Mini Code — free, made only for coding.

I pulled it onto my Mac and asked it to build me a few apps.

It answered at 92 words a second — faster than anything I'd run.

And it beats models four times its size at writing code.

Now there's a free coding engine living inside my Mac.

It builds tools, pages and games for me — offline, no bill.

I just say what I want, and watch it appear. That changed everything.

The framework · the Local AI Coding Engine

The Local AI Coding Engine™.

A free coding model that lives inside your own Mac — fast, private, made for building. Five things make it work.

i.

A Coder, Not A Chatbot

North Mini Code is trained for one job: writing software. Not a general chatbot dabbling in code — a model built to build.

ii.

Small Body, Big Punch

30B in size but only a few experts fire per word — so it's quick on your Mac, yet out-scores models 4× larger at code.

iii.

Thinking On Demand

Turn its reasoning on for tricky jobs, or off for instant builds. You pick speed or depth, per task.

iv.

Free + Yours

Apache-2.0 licensed and offline. No bill, no tokens, no limits — and nothing you type ever leaves the machine.

v.

Build By Voice

Wired into your dashboard, you just say "build me a calculator" — it writes it, previews it live, and saves it.

Old way vs new way

Paying for a coder vs owning one.

Here's the shift. Most people rent a coding AI by the month and send every line to a server. North flips that.

Old way · renting a cloud coder
$20–200/mo
  • Pay every month for a coding assistant
  • Every line of code leaves your machine
  • Hit limits mid-build, wait for the reset
  • Tried free local coders — slow, or wrote junk
  • Decided you need the big paid model
  • Result: you rent your coder and hand over your code
New way · North Mini Code on your Mac
$0/mo
  • A free coder built only for software
  • Beats models 4× its size at writing code
  • 92 words a second on a normal Mac — no limits
  • Nothing you type ever leaves the machine
  • Say the word and a whole app appears, live
  • Result: a fast, private coding engine you own
Why a 30B model is quick

It only wakes a few experts per word.

Here's the trick that makes North fast on a laptop.

Old models wake their whole brain for every single word. Big means slow.

North is split into 128 experts, and only 8 of them fire for each word.

Giant brain, tiny effort. That's how a 30B model runs as quick as a small one.

128 EXPERTS · ONLY 8 FIRE PER WORD
128 experts in the model. Only the 8 best for each word light up. Big brain, small effort, fast answers.
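
Here's the 8-of-128 idea as a toy you can run. Treat it as a rough sketch, not North's real router: the scores are random stand-ins made up for illustration, and the function names are hypothetical. A real model learns its scores from the word it is reading. The shape is the same, though: score every expert, keep the top 8.

```shell
#!/bin/sh
# Toy illustration of top-k expert routing: score 128 "experts", keep the best 8.
# The scores are random placeholders, not anything the real model computes.
NUM_EXPERTS=128
TOP_K=8

# Print one "expert_id score" line per expert (seed defaults to 42).
score_experts() {
  awk -v n="$NUM_EXPERTS" -v seed="${1:-42}" 'BEGIN {
    srand(seed)
    for (i = 1; i <= n; i++) printf "%d %.6f\n", i, rand()
  }'
}

# Sort by score, highest first, and keep only the TOP_K best.
pick_experts() {
  sort -k2,2nr | head -n "$TOP_K"
}

active=$(score_experts 42 | pick_experts)
echo "$active"
echo "active experts: $(echo "$active" | wc -l | tr -d ' ') of $NUM_EXPERTS"
```

8 of 128 is 6.25% of the experts doing work on each word, which is why a 30B model can feel like a much smaller one.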

On my Mac: North runs at ~92 words a second and takes about 19 GB of memory, light enough to leave plenty of room on a 36 GB machine. Compare that to a dense 28 GB model that crawled at 16 words a second. Right shape beats big size.
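
A quick sanity check on that 19 GB figure. One assumption, labeled loudly: a q4 file averages roughly 5 bits per weight once the quantization bookkeeping is counted (the exact number varies by format). The helper name `model_size_gb` is hypothetical. Note that every expert still has to sit in memory even though only 8 fire, so memory follows the full 30B while speed follows the few that fire.

```shell
#!/bin/sh
# Back-of-envelope size of a model file, in GB.
# usage: model_size_gb <billions_of_parameters> <bits_per_weight>
model_size_gb() {
  LC_ALL=C awk -v params="$1" -v bits="$2" 'BEGIN { printf "%.2f\n", params * bits / 8 }'
}

model_size_gb 30 5    # 30B weights at an assumed ~5 bits each: prints 18.75
model_size_gb 30 16   # the same weights at 16 bits each: prints 60.00
```

That lands close to the ~19 GB seen on the Mac, and shows why the 4-bit version is the one that fits on a laptop.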

What I built with it · honest

I gave it 16 jobs. Here's the truth.

I asked North to build 16 things in a row — games, tools, pages, visuals.

It cranked them out fast, about 20–40 seconds each.

The tools and pages were genuinely good — a full scientific calculator, a todo app, a pricing page, a drum machine.

The trickier games and animations were hit-or-miss — some played, some needed a fix.

That's the honest picture: no local model one-shots everything. North nails the well-defined builds and gets you 80% there on the hard ones — fast, free, and on your machine.

you › build me a scientific calculator
north › Here's a self-contained calculator ↓
<!DOCTYPE html> … keypad · operators · sin/cos/tan/log/√/π · live display
→ previewed live · saved to your workspace · ~38s
Fast + private · the two settings

Keep it warm. Keep it home.

Two small things make North feel instant and stay private.

First: keep it warm in memory, so it never reloads. The lag people blame on "slow models" is just the cold start.

Second: it listens only on your Mac's loopback address (127.0.0.1) and never phones home. I checked: while it built a whole app, Ollama had no connection open to anything outside the machine.

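# warm + listening only on your own machine
$ ollama ps
north-mini-code-1.0 19 GB 100% GPU warm
# -a makes lsof AND the two filters (by default it ORs them); -nP keeps addresses and ports numeric
$ lsof -a -nP -iTCP -c ollama | grep -v -e COMMAND -e 127.0.0.1
(nothing) ← zero connections out. fully offline.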
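
If you want to repeat that privacy check yourself, here is a small sketch you can rerun any time. Assumptions: the helper name `non_loopback` is hypothetical, the process name is matched as `ollama`, and anything bound to 127.0.0.1, [::1] or localhost counts as staying on the machine. It is a snapshot of the sockets open at the moment you run it, so run it while a build is in progress.

```shell
#!/bin/sh
# Read lsof output on stdin; print only sockets that are not loopback-only.
# An empty result means nothing is connected beyond this machine right now.
non_loopback() {
  grep -v -e '^COMMAND' -e '127\.0\.0\.1' -e '\[::1]' -e 'localhost'
}

# -a ANDs the filters (lsof ORs them by default); -nP keeps addresses and ports numeric.
if command -v lsof >/dev/null 2>&1; then
  leaks=$(lsof -a -nP -iTCP -c ollama 2>/dev/null | non_loopback)
  if [ -z "$leaks" ]; then
    echo "no non-loopback TCP sockets for ollama"
  else
    echo "$leaks"
  fi
fi
```

Any line it prints is a socket worth a second look, for example a model download still in progress.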
Set it up with us

Want North wired into your dashboard?

I put North Mini Code inside the Agent OS — voice in, builds previewing live, every app saved in one workspace. It sits next to my cloud agents as the free, fast coder.

Get the Agent OS →

link in the description ↗

Do it yourself · the setup

Run North in about ten minutes.

North needs the latest Ollama (it's a brand-new model). Four steps, starting with the update.

Update Ollama.

North is new, so grab the latest from ollama.com. Older versions can't run it yet.

Pull North — once.

The q4 version is about 19 GB, so it needs a Mac with more memory than that; a 36 GB machine has plenty of room. After this it's yours, offline.

Keep it warm.

One setting tells it to stay in memory instead of reloading. This is what makes it instant.

Wake it once.

Send a quick "hello" so it loads. Check it's warm — and you're live.

# 1. update Ollama first (North needs the latest) → https://ollama.com

# 2. pull North Mini Code (q4, ~19 GB)
ollama pull north-mini-code-1.0:q4_K_M

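# 3. keep it warm: stay loaded for 30 minutes after each request (use -1 to stay loaded indefinitely)
#    quit and reopen Ollama afterwards so the setting takes effect
launchctl setenv OLLAMA_KEEP_ALIVE 30m

# 4. wake it (thinking off = instant builds)
#    --think=false assumes this model exposes Ollama's thinking toggle; drop the flag if it is rejected
ollama run north-mini-code-1.0:q4_K_M --think=false "build me a snake game"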

# check it's warm:
ollama ps
north-mini-code-1.0 19 GB 100% GPU warm ← ~92 words/sec
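
If you would rather warm the model from a script than from the chat prompt, here is a minimal sketch using Ollama's local HTTP API. Assumptions: Ollama is on its default address, the model tag is the one from the pull step, and the helper name `preload_body` is hypothetical. Sending a request with no prompt asks Ollama to load the model without generating anything, and `keep_alive` sets how long it stays loaded for that request.

```shell
#!/bin/sh
# Preload a model through Ollama's local HTTP API, then list what is in memory.
OLLAMA_URL="http://127.0.0.1:11434"     # Ollama's default local address
MODEL="north-mini-code-1.0:q4_K_M"      # tag copied from the pull step

# Build the JSON body: no prompt (load only) plus a per-request keep_alive.
preload_body() {
  printf '{"model": "%s", "keep_alive": "%s"}' "$1" "$2"
}

# Only talk to the server if it is actually running.
if curl -s --max-time 2 "$OLLAMA_URL/api/ps" >/dev/null 2>&1; then
  curl -s "$OLLAMA_URL/api/generate" -d "$(preload_body "$MODEL" 30m)" >/dev/null
  curl -s "$OLLAMA_URL/api/ps"          # JSON list of models currently loaded
else
  echo "Ollama is not running on $OLLAMA_URL"
fi
```

Both requests go to 127.0.0.1, so this stays on your machine just like the commands above.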

Lighter Mac? gpt-oss:20b (OpenAI, ~12 GB, ~75 words/sec) is a great lighter pick. It has no thinking on/off switch, only reasoning-effort levels. You pull and run it the exact same way.

Should you do this · what holds people back

The three things people get wrong — backwards.

Belief: "A free local model can't really code."

Truth: North Mini Code out-scores models 4× its size on a public coding benchmark — and it's free, Apache-2.0. On my Mac it built a full scientific calculator, a todo app and a pricing page from one sentence each. The "free can't code" rule just broke.

Belief: "A 30B-class model will be way too slow on my Mac."

Truth: Backwards. North only fires 8 of its 128 experts per word, so it runs at ~92 words a second — faster than the small models. The slow ones are the old dense models that wake their whole brain. Right shape, not small size.

Belief: "Setting up a local model is too technical for me."

Truth: Update one app, then three short commands. If you can copy and paste a line, you can run this. Ten minutes, start to finish — and then it's free forever.

Don't take my word for it

3,600+ founders are running fast local coders like this inside the Boardroom right now. Their wins — real businesses, real results — are documented here.

Read the member wins →
Recap · what you walk away with

What you gain.

i.

You stopped paying. A free coder, Apache-2.0, runs on the Mac you own — that monthly bill is gone.

ii.

You out-punch the giants. A 30B model that beats 120B ones at code, on your laptop.

iii.

You stopped waiting. Only 8 experts fire per word — ~92 words a second, instant builds.

iv.

You stopped leaking. Nothing you type ever leaves your Mac — proven, zero connections out.

v.

You pick speed or depth. Thinking off for instant builds, on for the tricky jobs.

vi.

You build by voice. Say it, watch a whole app appear live, and it saves itself.

Your turn

Put a free coder that beats the giants inside your Mac.

You've seen it. North Mini Code is free, fast, made for code, and runs offline. If you want it set up with you — wired into a full dashboard, step by step — it's all inside the AI Profit Boardroom.

Get the Agent OS →

I'll see you inside ↗