Fable 5 got banned · its intelligence didn't
I. The Agent Planning Engine · proven by Kilo

Get Fable 5's intelligence. Without Fable 5.

[Image: a golden master blueprint on a pedestal, lit by an orb of intelligence, with many identical structures and gears assembling from the one plan]
59% cheaper than Fable for the same build
15 of 15 checks · identical output
$0.49 the whole genius premium · pay once

Fable 5 was the most powerful AI ever made public — until the US government had it pulled worldwide, three days after launch. But Kilo proved something first: hand a cheap model a good plan, and it builds the exact same thing Fable 5 would — identical, for 59% less. Here's the proof, then the system.

II. The proof · straight from Kilo

Kilo proved it: same brain, cheap.

This isn't my theory — the Kilo team ran the test and published it. They had two frontier models plan the same service, then made both build the winning plan from identical starts. Fable 5 — the most powerful model ever made public, 80% on SWE-bench Pro — and a cheap model produced services that were identical, down to which individual users a 35% rollout enabled, both passing all 15 checks. The only thing that changed was the bill.

"Planning with Claude Fable 5 and implementing with GPT-5.5 produced the same service for 59% less than using Claude Fable 5 for both phases… the model gap showed up in planning, and once that judgment was written down as a plan, execution stopped depending on the model."

— Darko & Job Rietbergen · Kilo Blog · 13 June 2026

Cost to BUILD the same plan
Same plan in, identical service out. The cheap model built it for 62% less.
Fable 5 builds it too: $16.66
A cheap model builds it: $6.30
Identical output, 62% less to build. Across the full plan-and-build pipeline that's 59% cheaper — a 2.4× gap that scales to roughly $10,800 a year more at 20 tasks a week. The checks couldn't tell the two services apart.
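The arithmetic behind those numbers is easy to check yourself. Here is a minimal sketch that uses only the two build costs and the 20-tasks-a-week workload quoted above; the 52-week year is my assumption, and the variable names are just for illustration.

```python
# Figures quoted in this guide from Kilo's published run.
FABLE_BUILD_COST = 16.66   # USD, Fable 5 builds the plan
CHEAP_BUILD_COST = 6.30    # USD, a cheaper model builds the same plan
TASKS_PER_WEEK = 20        # the guide's example workload
WEEKS_PER_YEAR = 52        # assumption: no weeks off


def build_savings(expensive: float, cheap: float) -> tuple[float, float]:
    """Return (absolute gap in USD, fractional saving) for one build."""
    gap = expensive - cheap
    return gap, gap / expensive


gap, saving = build_savings(FABLE_BUILD_COST, CHEAP_BUILD_COST)
annual_gap = gap * TASKS_PER_WEEK * WEEKS_PER_YEAR

print(f"gap per task: ${gap:.2f}")         # gap per task: $10.36
print(f"build saving: {saving:.0%}")       # build saving: 62%
print(f"annual gap:   ${annual_gap:,.0f}")  # annual gap:   $10,774
```

The 62% here is the build phase only. The 59% figure covers planning plus building, and the planning costs are not itemised in this guide, so the sketch does not try to reproduce it.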
Plan quality — where the intelligence actually lives
Kilo's rubric, out of 10. The model gap shows up in the plan, not the build.
Fable 5's plan: 9.1 (sharper)
Cheaper plan: 8.3 (left forks open)
Fable 5's plan won on judgment, not size — 431 tight lines that decided every fork, vs 1,456 that left calls open. The planning premium was just $0.49. So you buy Fable-level smarts in the plan for pennies — then build it cheap.
And the twist that makes this matter: three days after launch, Anthropic had to disable Fable 5 worldwide under a US government directive. The most powerful model vanished overnight — but its level of output didn't, because the intelligence was already saved in a plan a cheap model could build.
My story · why this matters

I was paying genius prices for typing.

Before

I used to point every job at the most expensive model.

Plan it with the best. Build it with the best. Fix it with the best.

The bill was brutal — and most of those tokens were just the model typing out boring code.

Then my "best model" got banned overnight, and half my setup stopped working.

Then I split the job in two.

After

Now the genius only writes the plan — the one part where judgment actually matters.

A cheap model does the building, and the output comes out the same.

My costs dropped by more than half for work I can't tell apart.

And when a model gets banned or rate-limited, I swap the cheap builder and keep moving.

You can have this too. One plan. A cheap army. Same result.

III. Who's telling you this

I run real builds on this stack every day.

I'm not guessing at AI costs from the sidelines. I run a 7-figure agency, a daily YouTube channel, and a members' community where people build with these tools for real work — agencies, ecom, coaching, content. Different businesses, same lesson: the operators who split planning from building pull ahead.

3,200+ founders in AIPB
70+ agency team
38 countries · live
$100k+/mo AIPB MRR
Before you scroll on —

Commit to transitioning today. Not tomorrow.

You've seen the number. Same service, 59% less. Measured, not hyped.

The next ten minutes show exactly how the split works and how to run it.

So here's the deal.

Promise yourself one thing right now — before you sleep tonight, you'll run one job the new way. Plan it with the smart model. Build it with a cheap one. Just once. Because the moment you stop paying genius prices for typing, your whole AI bill changes shape.

The people sitting still are burning money on tokens that don't need a genius. The people splitting it today are the ones who'll scale without the bill scaling with them.

Be one of those people.

Commit to the transition. Split one job tonight.

IV. The framework

The Agent Planning Engine™.

One smart agent writes the plan. Cheap agents build it. The expensive thinking happens once, gets written down, and then it's just labour — and labour should be cheap. Three parts.

i.

The Planner · plans

Your smartest model writes the plan and decides every fork — the hard call, the edge case, the exact math. You pay frontier price for this one short job, because judgment is the only place it's worth it.

ii.

The Plan · the plan.md

The plan is the artefact that carries the genius. It pins the hard parts so nothing drifts, and it's a reusable asset — write it once, run it forever, hand it to any model.

iii.

The Builders · build

Cheap models do the typing — as many as you want, in parallel. Because the plan already made every decision, a $5 model builds the same thing a $50 one would. Swap any of them in a click.

✦ the part that makes it work
  • The plan removes the guessing. Kilo found the model gap shows up in planning, because "guessing is where implementations diverge". Once the plan decides everything, the cheap model has nothing left to guess.
  • It's ban-proof. Fable 5 got pulled overnight. Your plan didn't care — hand it to the next available model.
  • It compounds. Every good plan is a reusable template your whole team (or 3,200+ members) can build from cheaply.
V. How the split works

One plan in. Identical builds out.

This is the surprising part Kilo proved. They gave the same plan to two different models. Both passed all 15 acceptance checks — and the two services were identical, down to which individual users a 35% rollout enabled. Same plan, same output, regardless of who built it.

[Diagram: PLANNER (the genius) → PLAN (decides all) → three cheap agents build → identical output]

The plan pins the maths, so every builder lands on the same answer. Cheaper hands, same output.

Why this works — in five pictures

The split sounds too good — pay less, get the same thing. So here's exactly what's happening under the hood, one picture at a time.

1 · Why the model stops mattering

A vague plan leaves forks open — so each model guesses, and two models build two different apps. A pinned plan decides every fork. There's nothing left to guess, so any model, cheap or frontier, lands on the exact same build.

[Diagram: VAGUE PLAN (forks left open) → any model → app A or app B ✕ differs. PINNED PLAN (decides every fork) → any model → same app ✓ identical]

Vague plan → it depends who builds it. Pinned plan → identical, every time.

2 · The three moves, in Kilo

It's one tool and three moves. The smart model writes the plan. You switch the model in a click. The cheap model builds it — then Kilo's review agents check it for bugs, security and logic before it ships.

[Diagram: PLAN (genius) → switch model ⇄ → BUILD (cheap) → review agents (sec, logic, perf) → shipped]

Plan with the genius · switch · build cheap · auto-review · ship.

3 · Two pipelines, two prices

Same job, two ways to run it. Use the genius for both phases and you pay frontier price for the typing too. Use the genius only for the plan, and the bill drops by more than half — for output you can't tell apart.

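[Diagram: OLD WAY · GENIUS FOR EVERYTHING: the genius plans and builds, build = $16.66. NEW WAY · AGENT PLANNING ENGINE: the genius plans, a cheap model builds, build = $6.30. 59% less across the pipeline.]

Same service either way. One build costs $16.66, the other $6.30: a $10.36 gap per task that scales to ~$10,800/yr at 20 tasks a week.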

4 · Ban-proof by design

Fable 5 was the best model — then it was switched off overnight. If your whole pipeline lived in one model, you'd be stuck. Here the plan is saved on disk. Swap the cheap builder for any model that's up, and keep shipping.

[Diagram: THE PLAN, saved on disk. Model A: banned. Model B: keeps shipping ✓]

The plan survives. Swap the builder. You're never hostage to one model again.
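Swapping the builder can be as simple as walking a preference list. This is a rough sketch of the idea, not something Kilo ships: `pick_builder` and the second model ID are hypothetical names I made up for illustration, and only `moonshotai/kimi-k2.7-code` comes from this guide.

```python
def pick_builder(preferred: list[str], unavailable: set[str]) -> str:
    """Return the first preferred model that is not currently unavailable."""
    for model in preferred:
        if model not in unavailable:
            return model
    # The plan on disk is untouched either way; only the builder is missing.
    raise RuntimeError("no builder available right now")


# Builders in order of preference. The second ID is a placeholder.
BUILDERS = ["moonshotai/kimi-k2.7-code", "example/backup-builder"]

# If the first choice is pulled or rate-limited, the next one takes over.
print(pick_builder(BUILDERS, unavailable={"moonshotai/kimi-k2.7-code"}))
# example/backup-builder
```

The design point is that the plan file never appears in this function. The judgment lives in plan.md, so the model choice is just a string you can change.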

5 · One plan, many builds

A good plan is a reusable asset. Write it once, then fan it out to as many cheap models as you want — building variations in parallel while you sleep. The expensive thinking is already done; the builds are nearly free.

[Diagram: ONE PLAN (reusable) → build, build, build, build, build … in parallel, while you sleep]

One plan in. Many cheap builds out — on repeat.
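If you wanted to script the fan-out, it could look roughly like this. Treat it as a sketch under assumptions: the `kilo run -m <model> --auto "<prompt>"` shape is copied from the commands in this guide and I have not verified it against the CLI's documentation, and `build_command`, `run_in_shell` and `fan_out` are hypothetical helper names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

# Build prompt copied from this guide's build step.
BUILD_PROMPT = "Read plan.md and implement it exactly. Run the tests."


def build_command(model: str, prompt: str = BUILD_PROMPT) -> list[str]:
    """Assemble one build invocation (flag shape assumed from this guide)."""
    return ["kilo", "run", "-m", model, "--auto", prompt]


def run_in_shell(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def fan_out(
    models: Sequence[str],
    runner: Callable[[Sequence[str]], int] = run_in_shell,
) -> dict[str, int]:
    """Run one build per model in parallel and map model -> exit code."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        codes = list(pool.map(lambda m: runner(build_command(m)), models))
    return dict(zip(models, codes))


# Usage (not executed here):
#   results = fan_out(["moonshotai/kimi-k2.7-code"])
#   failed = [m for m, code in results.items() if code != 0]
```

The runner is injectable so you can dry-run the fan-out without spending a token. In practice each parallel build would also need its own working copy of the repo, which this sketch leaves out.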

It's three commands in Kilo

Kilo Code runs both phases and lets you switch the model between them. The Planner writes the plan, you swap to a cheap model, the Builders build:

the Agent Planning Engine, in Kilo
# 1. The Planner writes the plan (pay genius once)
kilo run -m anthropic/claude-opus-4.8 --variant high --auto \
  "…write a very detailed plan in plan.md"

# 2. The Builders build it (nearly free) — one click swaps the model
kilo run -m moonshotai/kimi-k2.7-code --auto \
  "Read plan.md and implement it exactly. Run the tests."

# 3. See the cost split
kilo stats
Thinking it? "Won't the cheap model mess up the build?"

Not when the plan made every decision. Kilo's point: guessing is where builds go wrong — and a complete plan leaves nothing to guess.

Both models built the exact same plan and passed all 15 checks. The cheap one's only "cost" was being slightly less chatty, not less correct.

VI. I ran it on my own stack

One plan. Zero dollars.

I didn't want to just quote Kilo's run. So I wired Kilo into my Agent OS — one login unlocks 325 models — and ran the Agent Planning Engine on the post's core trap (a sticky percentage-rollout engine) using only free models. Here's what the Planner produced, for nothing.

The Planner's plan — real output, free model, $0.00
  • It picked an exact algorithm — FNV-1a 32-bit, hashing flagKey:env:userId into 10,000 buckets — and argued why.
  • It ran the code itself to compute and pin the exact bucket values: 9352, 7788, 8796 — so any drift in the maths fails the tests loudly.
  • It laid out the file structure, the enable condition, and every test (stickiness, monotonicity, distribution) — 249 lines, every fork decided.
  • Total cost on Kilo's free tier: $0.00.
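To make "pinned maths" concrete, here is a minimal sketch of the kind of engine that plan describes. It is not the Planner's actual output. The FNV-1a constants are the standard 32-bit ones, and the `flagKey:env:userId` key and 10,000 buckets come from the plan summary above. The enable condition (bucket below percent × 100) is my assumption, and so are the function names.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard FNV-1a 32-bit offset basis
FNV_PRIME_32 = 0x01000193         # standard FNV 32-bit prime
BUCKETS = 10_000                  # bucket count named in the plan


def fnv1a_32(data: bytes) -> int:
    """FNV-1a: XOR each byte in, then multiply by the prime, mod 2**32."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def bucket_for(flag_key: str, env: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, BUCKETS)."""
    key = f"{flag_key}:{env}:{user_id}".encode("utf-8")
    return fnv1a_32(key) % BUCKETS


def is_enabled(flag_key: str, env: str, user_id: str,
               rollout_percent: float) -> bool:
    """Assumed enable condition: the user's bucket is below the threshold."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    threshold = round(rollout_percent * BUCKETS / 100)
    return bucket_for(flag_key, env, user_id) < threshold


# A pinned value in the same spirit as the plan's: a published FNV-1a vector.
assert fnv1a_32(b"a") == 0xE40C292C
```

Because a user's bucket never changes, the rollout is sticky, and raising the percentage only ever adds users. Pinning a known hash value in a test is what makes any drift in the maths fail loudly, whichever model typed the code.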

That's the genius half — a complete, decision-making plan — for free. Hand that to any cheap builder and the build is just typing. The whole point of the post, reproduced on my own desk.

Thinking it? "Free models always rate-limit me — this won't run."

One Kilo login unlocked 325 models for me, including free ones strong enough to write that plan. And the paid genius (Opus 4.8) is right there the day you want it.

The whole point is you stop needing the expensive model for everything — so the cheap and free tiers carry most of the work.

VII. Who actually builds it

Meet the Builders — the cheap frontier.

When Fable 5 got pulled, the frontier didn't go dark — it got wider. Kilo lists over 500 models, and a handful of them build a sharp plan beautifully for pennies. This is the bench you pick your builder from. Every one is available right now, and three of the four are open-weight — so no government directive can switch them off.

"The Fable 5 situation is a reminder that 'most capable model available' is not the same thing as 'most capable model you can rely on in production.' Reliability isn't a fallback plan, it's a feature."

— Ari Messer · Kilo Blog · 15 June 2026

What the builders cost — $ per million output tokens
Lower is cheaper. The banned Fable 5 was the priciest of the lot. These build the same plan for a fraction.
GPT-5.5 · the reliable premium hand: $30.00
Kimi K2.7 Code · the everyday builder: $3.50
MiniMax M3 · ~40× cheaper than Fable: $1.20
Nemotron 3 Ultra · open-weight, free: Free
All four build a sharp plan well. MiniMax M3 runs about 40× cheaper than the Fable 5 that just got pulled — and Nemotron 3 Ultra is free and self-hostable, so a directive can't take it away.
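To turn those per-million prices into a per-build number, multiply by the output tokens a build produces. A quick sketch: the prices are the ones listed above, but the 2,000,000-token build size is a made-up figure for illustration, and input tokens are ignored.

```python
# Output prices in USD per million tokens, as listed in this guide.
OUTPUT_PRICE_PER_MILLION = {
    "GPT-5.5": 30.00,
    "Kimi K2.7 Code": 3.50,
    "MiniMax M3": 1.20,
    "Nemotron 3 Ultra": 0.00,
}

# Hypothetical build size, chosen only to make the sums easy to read.
OUTPUT_TOKENS = 2_000_000


def output_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of generating `tokens` output tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


for model, price in OUTPUT_PRICE_PER_MILLION.items():
    print(f"{model}: ${output_cost(OUTPUT_TOKENS, price):.2f}")
# GPT-5.5: $60.00
# Kimi K2.7 Code: $7.00
# MiniMax M3: $2.40
# Nemotron 3 Ultra: $0.00
```

Your real bill also depends on input tokens and on how chatty each model is, so use this as a rough comparison rather than a quote.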

The roster — pick your hands

Real specs, real prices (input $ per million tokens), straight from Kilo. Three are open-weight; one is free.

free

Nemotron 3 Ultra

NVIDIA · 550B MoE · 1M context · open-weight, self-hostable · 91% PinchBench. The free frontier — and the model my Planner used to write the live plan above.

$0.30/M

MiniMax M3

59% SWE-Bench Pro · open weights · roughly 1/40th the cost of Fable 5. The cost-efficiency king for high-volume building.

$0.75/M

Kimi K2.7 Code

1T MoE · 256K context · open MIT · 30% fewer reasoning tokens · 81.1 MCP score. The everyday builder — what the guide's commands use.

$5/M

GPT-5.5

#1 on KiloBench (74.2%) · 82.7% Terminal-Bench · similar execution to Fable 5. The reliable premium hand when the build really matters.

Thinking it? "Two of those are Chinese labs — what about my data?"

Fair — and it's why open weights matter. Kimi, MiniMax and Nemotron all publish their weights, so you can self-host and never send a byte off your own infrastructure.

Kilo's gateway also routes the same models through multiple providers (Fireworks, Morph, Parasail) — pick the one that fits your rules in org settings.

The Agent Operating System · AI Profit Boardroom

Want the Agent Planning Engine wired and ready?

You've seen the split — the Planner plans, the Builders build, Kilo switches between them. Inside the AI Profit Boardroom you get the whole thing set up in your Agent Operating System: Kilo wired in next to Claude, GLM, Kimi and Fusion, with the plan-and-build pipeline ready to run and a growing library of golden plans to build from.

What you get when you join
  • The full Agent OS — Kilo + every model in one dashboard, done for you
  • The Agent Planning Engine pipeline + the exact commands from this guide
  • A library of golden plans you can build from for pennies
  • Four live coaching calls a week + 3,200+ builders — someone's online 24/7
Get the Agent OS → link in the description
VIII. The shift

The old way vs the engine.

Same work — one way pays a genius to type, the other pays a genius to think and lets cheap hands do the rest.

The old way
~ $16.66 / task to build
  • Point every job at the most expensive model
  • Pay frontier price for planning AND the mechanical typing
  • One model gets banned or rate-limited — your pipeline breaks
  • Costs scale 1:1 with how much you build
  • No reusable artefact — you re-pay full price every time
  • ~$10,800/yr more at 20 tasks a week
The new way · Agent Planning Engine
~ $6.30 / task to build
  • Genius plans once — a $0.49 premium on a sub-dollar job
  • A cheap model builds it, identical output
  • A model dies? Swap the cheap builder, keep moving
  • 59% less for the same service, checked 15/15
  • The plan is a reusable asset — build from it forever
  • Costs stay flat while you scale the building
IX. What's holding you back

Three things stopping you.

The best model should do everything — that's what I'm paying for.

The best model is worth it for judgment, not typing. Kilo proved a cheap model builds the same plan identically — the premium on the build buys you nothing but a bigger bill.

I need to wait until one model clearly wins, then standardise on it.

There is no permanent winner — Fable 5 was the best, then it was banned overnight. Save the intelligence as a plan and you're never hostage to one model again.

This split sounds like a developer thing — too technical for me.

It's two prompts and a model switch in Kilo. If you can ask for a plan, then ask to build it, you can run it. Members who'd never opened a terminal are doing it.

Don't take my word for it

158 pages of members who already stopped overpaying for AI and started building the smart way — real businesses, real results, documented.

Read the 158-page testimonials doc →
X. The whole thing in 6 tiles

Recap — what you now have.

i.

You stopped overpaying

Genius prices only where judgment lives — the plan. The build runs cheap.

ii.

You kept the quality

Same plan, identical output — 15/15 checks, down to individual users.

iii.

You got ban-proof

A model dies, you swap the cheap builder. The plan doesn't care who builds it.

iv.

You bank the plans

Every plan is a reusable asset — build from it forever, for pennies.

v.

You run it in one tool

Kilo plans, switches model, builds, and shows the cost — all in three commands.

vi.

You proved it free

The Planner wrote a real, pinned-hash plan on Kilo's free tier — $0.00.

Pay for the plan. Build it for free.
One smart plan. A cheap build. The same result, for a fraction of the cost.

Want the whole engine, wired and ready?

Kilo + every model in one Agent Operating System — the Planner, the Builders, and the plan-and-build pipeline set up with you, step by step.

  • The full Agent OS — Kilo, Claude, GLM, Kimi and Fusion in one dashboard
  • The Agent Planning Engine pipeline + the exact commands and model picks
  • A library of golden plans to build from for pennies
  • Weekly coaching calls — we wire it up together
  • 3,200+ founders already building this way

See the 158 pages of member wins →

Get the Agent OS →
Inside the AI Profit Boardroom · aiprofitboardroom.com

Built for operators · used in 38 countries