It's the model its own author calls a hype cautionary tale — 10 training examples, three minutes of training. Ollama can't even run it. So I ran it on Apple MLX, benchmarked it, and had it build games. What came out surprised me.
I'm not repeating hype. I pulled the model, ran it locally, and scored what it built. Here are the actual sources:
Before
Every week a new "insane" local AI drops with a wild name.
I'd get excited, try to run it, and hit a wall — half of them won't even load in Ollama.
The ones that did run, I had no real way to tell if they were actually any good.
So I was either believing the hype or ignoring it. Both felt wrong.
Then I stopped guessing and built a system that runs + scores any model.
After
Now a hyped model drops and I just pull it, run it locally, and score what it builds.
If Ollama can't load it, my Agent OS Local engine runs it on Apple MLX instead.
I had this "joke" 27B build two real 3D games in an afternoon, then ranked it on my own leaderboard.
No hype. No faith. Just the receipts.
You can do this too. Same tools. Same path.
Here's what's happening for the operators already running this stack — agency owners, ecom founders, course creators, solo builders. Different businesses, same result.
I'm not going to paste invented quotes here. The wins are real and written by the members themselves — across 38 countries — so read them in their own words.
Read the 158-page wins doc →You've seen the hype cycle. A model drops, everyone calls it insane, nobody actually checks.
The next 10 minutes show you how I pull any model, run it locally for free, and score it myself.
So here's the deal. Promise yourself one thing right now: the next time a local AI goes viral, you're going to RUN it before you trust it.
The people who test get the real answer. The people who repeat the hype get burned.
Be one of the people who tests.
Commit to testing today. This changes how you pick every tool from now on.
Let's be straight about what Qwable-5-27B-Coder is, because the name oversells it.
It's Qwen3.6-27B — a strong open base model from Alibaba — with a full fine-tune on just 10 traces (5 from Fable 5, 5 from Kimi 2.7 Coder), trained in about three minutes on a single machine.
The author didn't hide this. They put it front and centre and even refused to publish benchmarks — on purpose — to make a point about AI hype. Their own words: it's "not recommended as a production coding model."
So this is basically Qwen3.6-27B wearing a flashy "Qwable Coder" jacket. That matters for one big reason: anything good it does is the base model being good — not the branding.
No — and that's the interesting part. It's an honest experiment about marketing vs substance. The 10-trace fine-tune barely changed the model (the quality drift is near zero). Which means you're really testing whether a free, open 27B base can do real work on your own Mac. Spoiler: it can — it's just dressed up in hype it doesn't need.
This is the real lesson for running local AI in 2026: for brand-new models, MLX (Apple's own framework) often works when Ollama and llama.cpp can't yet. I wired MLX straight into my Agent OS Local engine so it just works.
I run a public leaderboard, GoldieBench, where every model gets the same one-prompt build challenges and a 0–10 score for what it actually ships. So I put Qwable through it on my Mac.
For a "10-trace, 3-minute, don't-use-this" model, the scores are genuinely good — because that 27B base is no joke:
8.3 is task-winner territory — the same band the frontier cloud models land in. From a free model running on a laptop.
But there's a flip side, and I won't hide it: it's slow, and it over-thinks. It's a 27B "thinking" model, so it reasons for a long time before answering. I watched it burn 900 tokens of reasoning on a one-sentence question. Each game took about five minutes to build.
A 9B local model runs ~3× faster. Qwable trades speed for the smarts of a much bigger brain. Pick the right tool for the job.
This is the proof. No editing, no second prompt, no fixing its code. I asked once, and these came out — running, playable, on my own machine. Click any of them.
Three real things. One model. $0. Nothing left my Mac. All ranked live on GoldieBench →
Often true — and Qwable proves it cuts both ways. I asked it for an abstract particle "wormhole" and got a black screen. But ask it for a concrete scene — a dragon, a road, a car, a landing page — and the strong base shines. The lesson: give a local model a clear, recognisable target, not vibes.
Here's what this whole experiment proves, and it's the thing I keep repeating.
A "Qwable Coder" that's really just Qwen3.6-27B can build an 8/10 game. A 9B "Qwythos" can barely manage 3/10 on the same tasks. The fancy names told you nothing — only running them did.
So stop shopping for the magic model. There isn't one. The edge isn't the model you pick this week — it's the system you plug every model into: one that runs any of them, remembers your business, gives them your tools, and ships the output. That's the Agent OS.
No — that's the biggest myth about it. Agent OS runs the everyday 90% on a free local model (on your own machine, $0, nothing leaving it — exactly like Qwable here), free APIs slot in for more, and for the frontier work it drives the CLIs you already pay for — your Claude subscription already includes the Claude CLI, and Agent OS plugs straight into it, so you're not paying twice. It's a layer on top of what you already own, not a new meter. And inside the AI Profit Boardroom there are full token-optimisation tutorials, so you learn to cut usage to the bone.
Qwable is one model. The Agent OS is the operating system that runs any model — local or frontier — from one dashboard. Here's what's inside:
You're not buying a tool. You're getting the whole operating system I run a seven-figure business on.
Get the Agent OS →Wrong: "I need to find the one best AI model and stick with it."
Right: There is no one best model — they leapfrog weekly. The win is a system that swaps any of them in and tests which is best for the job.
Wrong: "If a model is hyped and has a cool name, it must be good."
Right: A "Qwable Coder" is just Qwen3.6 with a jacket on. The only way to know if a model is good is to run it on real work and score it. Names lie. Receipts don't.
Wrong: "Running powerful AI locally costs a fortune."
Right: Qwable built two 3D games on my laptop for $0. Free local models do the everyday 90%, and your existing CLIs cover the rest. The cost myth is just a myth.
158 pages of members who stopped chasing hype and started shipping — real businesses, real wins.
Read the 158-page testimonials doc →Qwable is Qwen3.6-27B with hype branding — the base does the work.
When Ollama + llama.cpp can't load a new model, Apple MLX runs it on your Mac.
Two 3D games (8.3 + 8.0) and a landing page (7.2), one prompt each, $0.
A 27B is slow (~18 tok/s) and over-thinks — great smarts, not great speed.
The model doesn't matter. The system you plug it into does.
Pull any model, run it free, score it — the Agent OS way.
You watched a free 27B build two real games on a laptop. Imagine that power wired into one system that knows your business, runs your tools, and ships the work — that's the Agent Operating System inside the AI Profit Boardroom.
You get the full Local engine (run Qwable + any local or MLX model offline), every CLI you already pay for in one dashboard, the Planner→Builder→Reviewer Kanban, the live-preview Workspace, the memory vault, token-efficiency playbooks, and 3,600+ founders building alongside you — with new tools added the week they drop.
Set it up in an afternoon. Then never chase hype again.
Get the Agent OS →