GLM 5.2 · OPEN WEIGHTS · 1M CONTEXT · CODING-FIRST
Dropped — 13 June 2026
I. The other huge AI drop today

GLM 5.2 — the 1M-context flagship. The same day Claude got banned, China gave one away.

While the US government was switching off Claude's best models, Zhipu AI quietly dropped GLM 5.2, a flagship coding model with a one-million-token context window. The full open weights land next week under the MIT licence. It plugs straight into Claude Code and OpenClaw, and I've already wired it into my Agent OS. Let me show you exactly what it is, what it can do, and how to switch your coding agent over to it today.

Context window 1M tokens
Entry Flat Coding Plan
Open weights MIT · next week
Works in Claude Code + OpenClaw
II. Straight from Z.ai

Here's the announcement.

Zhipu announced it on X themselves. Read it — then I'll break down what each part actually means.

Nearly a million views in a day. Three claims stand out: a 1M context window, a coding-first focus, and weights that go fully open under MIT next week. Let's take each one.

III. What's actually inside

The specs, in plain English.

1M
Context window
5x bigger than GLM 5.1's 200K. Model ID glm-5.2[1m].
131K
Max output
Up to 131,072 tokens in a single response.
744B
Parameters (MoE)
~40B active per token. Carried over from the GLM-5 line.
2
Thinking gears
High and Max effort. Z.ai says use Max for coding.
MIT
Open licence
Full weights, free for commercial use — landing next week.
Flat
Coding Plan
The Lite tier — no per-token charge for 5.2.

The headline is that context window. A million tokens means you can drop an entire codebase into one session and it holds the thread. That's the lane where Claude has always had the edge — large-codebase work and long, multi-hour agent runs. GLM 5.2 is aiming straight at it.

One honest flag: Z.ai calls the 1M context "usable" — a careful word. Retrieval quality across the whole window hasn't been independently tested yet. Treat it as promising, not proven.

The pricing tiers (one "prompt" ≈ 15–20 model calls)
  • Lite — the entry tier, ~400 prompts/week.
  • Pro — ~2,000 prompts/week.
  • Max — ~8,000 prompts/week.
  • Team — seat-based for organisations.
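To put those quotas in model-call terms, here's a quick back-of-envelope sketch. It uses only the figures quoted above (prompts per week and the 15 to 20 calls per prompt estimate), both of which are approximate. The function name is mine, not anything from Z.ai.

```python
# Rough weekly model-call budget per tier, using only the figures quoted above.
# Both inputs are approximations ("~400 prompts/week", "15-20 calls per prompt").
PROMPTS_PER_WEEK = {"Lite": 400, "Pro": 2_000, "Max": 8_000}
CALLS_PER_PROMPT = (15, 20)  # low and high estimate


def weekly_call_range(tier: str) -> tuple[int, int]:
    """Return (low, high) estimated model calls per week for a tier."""
    prompts = PROMPTS_PER_WEEK[tier]
    low, high = CALLS_PER_PROMPT
    return prompts * low, prompts * high


for tier in PROMPTS_PER_WEEK:
    low, high = weekly_call_range(tier)
    print(f"{tier}: roughly {low:,} to {high:,} model calls per week")
```

On those assumptions, even the Lite tier works out to several thousand model calls a week.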

A frontier-class coding model with 1M context on a flat plan makes per-token API spend look steep. That's the whole point.

IV. The twist nobody's talking about

The company behind it is blacklisted in America.

Here's the part that makes today surreal.

Zhipu AI — the lab behind GLM — was added to the US Entity List back in January 2025.

It was the first Chinese large-model company ever blacklisted by the US government.

And it didn't get the soft version. It got a "Footnote 4" designation — the harshest tier, which blocks not just US tech but any foreign product built with US parts.

So sit with the timing.

On the same day the US government switched off Claude's best models for national security, a Chinese company that America has officially blacklisted handed those same Americans a free, frontier-class coding model.

The thing you were told was too dangerous to export, and the thing built by a company you're not allowed to trade with, landed on the same day.

Who Zhipu actually is
  • Founded in 2019 by two Tsinghua University professors, Tang Jie and Li Juanzi — it's a serious research lab, not a startup chasing hype.
  • Backed by everyone. Alibaba, Tencent, Ant Group, Xiaomi, Meituan — and even Saudi Aramco's investment arm.
  • It went public in January 2026 on the Hong Kong exchange at roughly a $6.6 billion valuation. The stock jumped about 173% in a month.
  • JPMorgan rates it "Overweight." This is not a fringe project — it's one of China's flagship AI bets.

That's the real story under the model. The US tried to wall this company off. The company answered by giving its best work away for free, under the most permissive licence there is. Whatever you think of that — it's a strategy, and it's working.

V. Under the hood

Why it can offer 1M context this cheap.

You don't need to be an engineer for this — but two pieces of tech explain why it runs so efficiently.

1. It only pays attention to what matters

Normal models make every word look at every other word. That's what makes long context expensive: double the text and the attention work roughly quadruples.

GLM uses something called DeepSeek Sparse Attention. Instead of everything looking at everything, the model dynamically focuses on the tokens that actually matter and skips the rest.

That's the trick that makes a one-million-token window affordable instead of bankrupting. Same idea as skim-reading a book for the important bits instead of re-reading every page.
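If you want to see the arithmetic, here's a rough sketch. It counts how many word-to-word comparisons a dense model makes versus a sparse one that keeps only the top k tokens per query. Treat it as an illustration under stated assumptions: k = 2,048 is a number I picked for the example, not a published GLM 5.2 setting, and the cost of choosing which tokens to keep is left out.

```python
# Back-of-envelope comparison: dense attention vs. a top-k sparse scheme.
# ASSUMPTION: k = 2,048 kept tokens per query is an illustrative value only,
# not a published GLM 5.2 setting. The cost of selecting the tokens is ignored.

def dense_pairs(n_tokens: int) -> int:
    """Every token attends to every token: n * n score computations."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, k: int) -> int:
    """Each token attends to at most k selected tokens: n * min(k, n)."""
    return n_tokens * min(k, n_tokens)


def top_k_indices(scores: list[float], k: int) -> list[int]:
    """Toy selection step: positions of the k highest relevance scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


K = 2_048
for n in (200_000, 1_000_000):
    ratio = dense_pairs(n) / sparse_pairs(n, K)
    print(f"{n:>9,} tokens: dense needs about {ratio:,.0f}x more score computations")

# Toy example: keep the 2 most relevant of 5 earlier tokens.
print(top_k_indices([0.1, 0.9, 0.3, 0.8, 0.2], k=2))
```

The point is the shape of the curve: the dense count grows with the square of the context, while the sparse count grows in a straight line.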

2. It trains itself faster than anyone

Zhipu built a training system they call Slime. It splits the "generate practice problems" step from the "learn from them" step so both run at full speed without waiting on each other.

That's the boring-but-huge reason GLM went from 5 to 5.1 to 5.2 in a matter of months. They can retrain and ship faster than almost anyone.
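Here's a toy version of that split, so you can see the idea. One thread generates practice attempts and drops them in a queue; another pulls them out and "learns" in batches, so neither has to wait for the other to finish a full round. This is a generic producer/consumer pattern I wrote for illustration. It is not Slime's actual code, and the fake rewards and batch sizes are made up.

```python
# Toy illustration of decoupling "generate" from "learn" with a queue.
# ASSUMPTION: this is a generic producer/consumer pattern, not Slime's real code.
import queue
import threading

SENTINEL = None  # marks the end of the rollout stream


def generate_rollouts(out: queue.Queue, n_rollouts: int) -> None:
    """Producer: stands in for the model generating practice attempts."""
    for i in range(n_rollouts):
        out.put({"id": i, "reward": i % 2})  # fake reward: 0 or 1
    out.put(SENTINEL)


def train(inbox: queue.Queue, batch_size: int, log: list) -> None:
    """Consumer: pulls finished rollouts and 'trains' on them in batches."""
    batch = []
    while True:
        item = inbox.get()
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) == batch_size:
            log.append(sum(r["reward"] for r in batch) / batch_size)
            batch = []
    if batch:  # flush a final partial batch
        log.append(sum(r["reward"] for r in batch) / len(batch))


def run(n_rollouts: int = 10, batch_size: int = 4) -> list:
    """Run both stages concurrently and return the mean reward per batch."""
    buffer = queue.Queue(maxsize=8)  # bounded buffer between the two stages
    mean_rewards: list = []
    producer = threading.Thread(target=generate_rollouts, args=(buffer, n_rollouts))
    consumer = threading.Thread(target=train, args=(buffer, batch_size, mean_rewards))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return mean_rewards


print(run())
```

The bounded queue is the design choice that matters: the generator can run ahead of the trainer, but only by a fixed amount.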

The scale jump, in numbers
  • GLM-4.5: 355B parameters, 32B active per token.
  • GLM-5 → 5.2: 744B parameters, 40B active — more than double the size.
  • Training data: jumped from 23 trillion to 28.5 trillion tokens.
  • The mission, in their own words: the GLM-5 paper is literally titled "From Vibe Coding to Agentic Engineering."

Translation: this line was built specifically for long, autonomous coding work — not chat. That's the lane it's gunning for.

VI. The catch you should know

They shipped it with zero benchmarks.

Here's the part to be honest about. At launch, Zhipu published no benchmark numbers. No SWE-bench, no LiveCodeBench, no HumanEval. Nothing.

The X replies noticed instantly. Some read the silence as a red flag — "why hide the numbers if they're good?" Others who actually tried it came away impressed.

What people are saying (first 24 hours)
  • "On par with GPT-5.5 high." — one developer who tested it on real coding tasks.
  • "The lack of benchmarks makes me think it's below its competitors." — a fair skeptic.
  • "It should be open — yet only on the Coding Plan?" — the open-vs-paywalled tension.
  • "No multimodal." — text only for now. If you need vision, this isn't it yet.

For the lineage, though, we have real, independently-checked numbers — and they're strong.

What the GLM-5 family already proved
  • GLM-5.1 hit state-of-the-art on SWE-bench Pro — 58.4, beating GPT-5.4, Claude Opus 4.6 AND Gemini 3.1 Pro. That's the headline result.
  • GLM-5 scored 77.8% on SWE-bench Verified — neck and neck with GPT-5.2 (80.0) and Claude Opus 4.5 (80.9).
  • It beat GPT-5.2 on multilingual coding — 73.3 vs 72.0, and crushed Gemini 3.0 Pro (65.0).
  • It ran for 8 hours straight — roughly 1,700 autonomous agent steps in a single session, looping plan → execute → test → fix without a human.

So the honest read: GLM is genuinely top-tier on coding and agent work, a touch behind on the very hardest pure reasoning, and unbeatable on price. GLM 5.2 is that same family with five times the context and a deeper thinking mode. Wait a week for independent 5.2 numbers before you trust the hype — but the bloodline is already proven.

VII. The number that matters most

The efficiency gap is not even close.

Strip away the hype and this is the real reason people are switching.

7.8x
Cheaper
GLM-5 is far more token-efficient than Claude Opus for the same coding work.
94.6%
Of Opus' score
GLM-5.1 already reached 94.6% of Claude Opus 4.6's coding performance (45.3 vs 47.9).
Flat
monthly plan
GLM Pro is one flat monthly price vs a much pricier comparable Max plan.

So you're getting ~95% of the top model's coding score, for a fraction of the cost.

And here's the kicker — switching is basically free. You don't change tools. You change one endpoint and keep working in Claude Code exactly like before.

For a long time, Claude's one untouchable edge was native 1M context. That was the reason to pay up. GLM 5.2 just closed that gap too.

That doesn't make Claude bad. Claude still wins the hardest reasoning, and it has multimodal that GLM doesn't. But it does mean the "just pay for the best, it's worth it" argument got a lot weaker this week.

VIII. It's not even the only one

GLM 5.2 landed in a crowded field.

This is the bit most coverage misses. GLM 5.2 isn't alone. There's a whole wave of Chinese open models trading blows at the top of the coding charts right now.

The open-source coding leaderboard (2026)
  • DeepSeek V4 Pro — tops the charts (~87). Raw algorithmic muscle: best LiveCodeBench (93.5) and Codeforces (3206) of any model, open or closed.
  • GLM-5.1 — right behind (~83). The best all-rounder for long-horizon agentic engineering, plus the MIT licence.
  • Kimi K2.6 — (~81). Built for sub-agent parallelism — shines when you run many agents in a harness at once.
  • Qwen 3.6 Plus — (~79). Also ships a 1M context window — GLM's main rival on long-context.

Notice what every single comparison site concludes.

"The right answer is not a single winner. It is a routing strategy. Each of these models is genuinely the best choice for specific workflows."

— the consensus across every 2026 coding-model roundup

Read that twice. The experts aren't telling you to pick one model. They're telling you to run several and send each job to whichever one is best for it.

DeepSeek for hard algorithms. Kimi for parallel agents. GLM for long agentic builds. Claude for the trickiest reasoning. Whatever's cheapest for the simple stuff.

That's not a model decision. That's a system decision. Which is exactly where this is going.
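In code, a routing strategy can start out as nothing more than a lookup table. Here's a minimal sketch of the split described above. The job labels and model ID strings are placeholders I made up for the example, so swap in whatever identifiers your own providers expect.

```python
# Minimal routing sketch: map a job type to a model name.
# ASSUMPTION: the job labels and model ID strings are placeholders for
# illustration; use whatever identifiers your own providers expect.
ROUTES = {
    "hard_algorithms": "deepseek-v4-pro",
    "parallel_agents": "kimi-k2.6",
    "long_agentic_build": "glm-5.2",
    "tricky_reasoning": "claude",
}
DEFAULT_MODEL = "cheapest-available"  # for the simple stuff


def pick_model(job_type: str) -> str:
    """Return the model for a job type, falling back to the cheap default."""
    return ROUTES.get(job_type, DEFAULT_MODEL)


for job in ("hard_algorithms", "long_agentic_build", "rename_a_variable"):
    print(f"{job} -> {pick_model(job)}")
```

Adding a new model later means adding one line to the table, not rewriting the callers.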

IX. The part you actually want

How to switch your agent to GLM 5.2 today.

This is the bit that matters. You don't need a new tool. GLM 5.2 plugs into the agents you already run — Claude Code, OpenClaw, Cline, OpenCode, Roo Code, Goose, Crush, Kilo Code. Here's the exact setup.

Step 1 — get on the Coding Plan

Grab a GLM Coding Plan key from z.ai (any tier works). The model runs on the coding endpoint — not the pay-per-token one — so the subscription covers it:

endpoint
https://api.z.ai/api/coding/paas/v4

Step 2 — point Claude Code at it

Open ~/.claude/settings.json and set GLM 5.2 as your Opus and Sonnet models. The [1m] suffix turns on the full million-token context:

~/.claude/settings.json
{
  "env": {
    "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "1000000",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL":  "glm-4.5-air",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-5.2[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL":   "glm-5.2[1m]"
  }
}
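
If you already have a settings.json with other keys in it, you probably don't want to paste over the whole file. Here's a small sketch that merges the four values above into the existing "env" block and leaves everything else alone. The helper name and the merge approach are my own assumptions; only the env values come from the config above. Back up your file before running anything like this.

```python
# Merge the GLM settings into an existing Claude Code settings file
# without overwriting unrelated keys.
# ASSUMPTION: the helper name and merge-in-place approach are illustrative;
# the env values are copied from the config shown above.
import json
from pathlib import Path

GLM_ENV = {
    "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "1000000",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.5-air",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-5.2[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-5.2[1m]",
}


def merge_glm_env(settings_path: Path) -> dict:
    """Add GLM_ENV under "env", keeping any other settings already in the file."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    settings.setdefault("env", {}).update(GLM_ENV)
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


# Example call (left commented out so nothing is written by accident):
# merge_glm_env(Path.home() / ".claude" / "settings.json")
```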

Step 3 — turn the thinking up

Inside a session, type /effort and pick max. For coding, max effort is where GLM 5.2 does its best, most stable work:

claude code
# low / medium / high  → maps to GLM "high"
# xhigh / max          → maps to GLM "max"  (use this for code)
/effort max
/status   # confirm it now shows glm-5.2[1m]

On OpenClaw it's the same idea — add the glm-5.2 model to your zai provider and set it as primary with a fallback. That's it. New brain, same hands.

And if Claude Code says the [1m] model "doesn't exist" — just update Claude Code to the latest version and try again. That's the one gotcha.

X. What I did with it

I already wired GLM 5.2 into my Agent OS.

This is exactly why I built the system the way I did.

When GLM 5.2 dropped, I didn't have to rebuild anything.

I added one provider profile in Hermes pointed at the coding endpoint.

I dropped a GLM 5.2 workspace bucket into the Agent OS dashboard — right next to my Kimi and N2 ones.

And within minutes I had GLM 5.2 building real apps in its own bucket.

That's the whole game.

While Claude was getting banned and everyone panicked, a new top-tier model showed up — and my system just absorbed it.

New model on Monday, another on Friday — it doesn't matter. The Agent OS treats every model as a swappable part.

You can build the exact same thing. Same tools. Same path.

3,100 Founders in AIPB
163k X followers
38 Countries · members
158 Pages of member wins

"The biggest unlock wasn't one tool — it was having a system that absorbs every new model the week it drops."

— theme from members inside the Boardroom
XI. Why this keeps happening

One model gets banned. Another gets given away. Same week.

That's the new normal. The model layer changes weekly now. The people who win aren't chasing each drop — they're running a system that takes the best of all of them.

The old way
one model
  • Pick one AI, wire everything to it
  • A better model drops — you can't use it without a rebuild
  • Your model gets banned or gated — you're stuck
  • You pay top dollar for one provider's API
  • Every drop is FOMO and stress
  • Result: always one step behind
The new way
one system
  • Run many models through one dashboard
  • GLM 5.2 drops — you add one profile and it's live
  • A model gets banned — the system routes around it
  • Use the cheapest model that's good enough per job
  • Every drop is a free upgrade, not a rebuild
  • Result: always on the best model, for the lowest cost
[Image: Claude, OpenClaw, Hermes and GLM connected into one Agent Operating System]
The Agent Operating System

Drop in any model the week it launches — GLM 5.2 included.

This is the exact system I used to add GLM 5.2 in minutes. The Agent Operating System connects Claude, OpenClaw, Hermes — and now GLM — into one dashboard with one shared memory.

When a new model drops, you add one profile and it's live across your whole setup. When one gets banned, the system just routes to another. You're never rebuilding and never stuck.
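
To make "routes to another" concrete, here's a minimal fallback sketch: try models in priority order and move on when one is unavailable. It's a generic pattern, not the Agent OS code. `call_model` stands in for whatever provider client you use, and `ModelUnavailable` is a hypothetical exception defined just for this example.

```python
# Fallback sketch: try models in priority order, skip any that are unavailable.
# ASSUMPTION: `call_model` is a stand-in for your real provider client, and
# ModelUnavailable is a hypothetical exception defined here for illustration.
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by a client when a model is banned, gated, or down."""


def run_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Return (model_used, reply) from the first model that answers."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            errors.append(f"{model}: {exc}")
    raise RuntimeError("no model available: " + "; ".join(errors))


# Usage with a fake client where the first model is switched off.
def fake_client(model: str, prompt: str) -> str:
    if model == "claude":
        raise ModelUnavailable("blocked")
    return f"[{model}] {prompt}"


print(run_with_fallback("write tests", ["claude", "glm-5.2", "kimi"], fake_client))
```

The caller never needs to know which model answered unless it wants to, which is what makes a model a swappable part.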

What you get when you join
  • The full Agent OS zip — every prompt, every config, ready to install
  • The exact GLM, Kimi and Claude wiring I use, step by step
  • The Obsidian memory setup so your AI knows your business cold
  • Coaching calls every week where I walk you through it
  • A 30-day roadmap to get the whole system running
  • 3,100 founders building alongside you — someone's online 24/7
Get the Agent OS → link in the description
XII. Read it yourself

Every source, first-hand.

XIII. GLM 5.2 in 10 tiles

The whole thing — at a glance.

i.

It dropped today

13 June 2026 — same day the US banned Claude Mythos and Fable 5. Wild timing.

ii.

Made by a blacklisted lab

Zhipu has been on the US Entity List since Jan 2025 — yet just gave Americans a free model.

iii.

1M context

Five times bigger than GLM 5.1. Drop a whole codebase in one session.

iv.

A flat plan

Frontier-class coding without a per-token meter.

v.

7.8x cheaper

~95% of Claude Opus' coding score for a fraction of the cost. Zero migration.

vi.

Open under MIT

Full weights, free for commercial use, landing next week.

vii.

Proven bloodline

GLM-5.1 hit SOTA on SWE-bench Pro, beating GPT, Claude and Gemini. Ran 8 hours solo.

viii.

Plugs into your agent

Claude Code, OpenClaw, Cline and more. One config change to switch.

ix.

Not the only one

DeepSeek, Kimi and Qwen are all fighting at the top. The answer is routing, not picking.

x.

The lesson

Models churn weekly. Own the system that absorbs every one of them.

A new model drops every week now. Your system should eat them for breakfast.
That's the whole reason I built the Agent OS.

Want to add any model in minutes?

GLM 5.2 today. Something better next week. The only way to keep up is a system that absorbs new models instead of forcing a rebuild.

Grab the Agent Operating System inside the AI Profit Boardroom. It connects Claude, OpenClaw, Hermes and GLM into one system with shared memory and one dashboard you control. New model drops? Add a profile, and it's live everywhere. I'll show you the exact wiring I use.

  • The full Agent OS zip — every prompt and config
  • The GLM + Kimi + Claude setup I run, step by step
  • The Obsidian memory setup so your AI knows your business
  • Weekly coaching calls — we set it up together
  • 3,100 founders already building this way

See the 158 pages of member wins →

Get the Agent OS →
Inside the AI Profit Boardroom · aiprofitboardroom.com

Built for operators · used in 38 countries