GLM-5.2 vs Claude Opus: open weights win on cost, lose on speed

Manaal Khan · 23 June 2026 at 12:02 am · 6 min read

Key Takeaways

Source: Hacker News: Best
  • Opus built a cleaner 3D game in 33 minutes; GLM-5.2 took 70 minutes but cost $5.39 versus $22
  • GLM-5.2 is text-only with no vision, so workflows using screenshots still need a multimodal model
  • Open weights mean a downloaded copy of GLM-5.2 can't be deprecated or restricted by its vendor, a real consideration after recent model retirements

Z.ai released GLM-5.2, and the internet immediately started arguing about whether it matches Claude Opus. TechStackups ran a direct comparison: same prompt, same assets, build a 3D platformer in raw WebGL. Opus finished in half the time and produced cleaner code. GLM-5.2 cost a quarter of the price. Neither result is surprising, but the specifics matter.

The test asked each model to write a browser game from scratch. No Three.js, no engine. The model had to parse GLB files, write GLSL shaders, and handle skeletal animation, collision detection, and a follow camera. This is the kind of multi-file, multi-step build that exposes whether a model can hold context over a long run.

How did GLM-5.2 compare to Opus on raw numbers?

Metric         GLM-5.2      Claude Opus 4.8
Build time     1h 10m 40s   33m 30s
Output tokens  131,000      216,809
Tool calls     128          153
Cost           $5.39        ~$21.92
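
The "half the time" and "quarter of the price" summaries can be checked against the table. The snippet below is a minimal sketch that uses only the reported figures; the helper name parse_duration is made up for this illustration.

```python
import re

def parse_duration(text: str) -> int:
    """Convert a duration such as '1h 10m 40s' into seconds."""
    seconds_per_unit = {"h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)\s*([hms])", text)
    return sum(int(value) * seconds_per_unit[unit] for value, unit in parts)

# Figures copied from the comparison table.
glm_seconds = parse_duration("1h 10m 40s")
opus_seconds = parse_duration("33m 30s")

time_ratio = glm_seconds / opus_seconds   # how much longer GLM-5.2 ran
cost_ratio = 5.39 / 21.92                 # GLM-5.2 cost as a share of Opus (Opus figure is approximate)

print(f"GLM-5.2 took {time_ratio:.2f}x as long")      # 2.11x
print(f"GLM-5.2 cost {cost_ratio:.2f}x as much")      # 0.25x
```

So GLM-5.2 ran a little over twice as long and cost just under a quarter as much, which matches the rounded claims.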

Opus shipped faster despite generating more tokens. The TechStackups team attributed this to Opus needing fewer corrections: it got things right the first time more often, so the total run was shorter even though it talked more. GLM-5.2 iterated more, backtracked more, but still reached a working game.

On cost, the gap is stark. GLM-5.2 charges $1.40 per million input tokens and $4.40 per million output tokens. Opus charges $5 and $25 respectively. For a long agentic run, that difference compounds.
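
As a rough check on where the money goes, the sketch below prices only the reported output tokens at the listed output rates. The remainder is assumed here to be input and cached-input spend, which the source does not break down, so treat that split as an inference rather than a reported figure.

```python
# Output-token counts, output prices (USD per million tokens), and total run
# costs as reported in the article. The Opus total is approximate.
OUTPUT_TOKENS = {"GLM-5.2": 131_000, "Claude Opus 4.8": 216_809}
OUTPUT_PRICE_PER_MILLION = {"GLM-5.2": 4.40, "Claude Opus 4.8": 25.00}
REPORTED_TOTAL = {"GLM-5.2": 5.39, "Claude Opus 4.8": 21.92}

def output_cost(model: str) -> float:
    """Cost in USD of the output tokens alone."""
    return OUTPUT_TOKENS[model] * OUTPUT_PRICE_PER_MILLION[model] / 1_000_000

for model in OUTPUT_TOKENS:
    out = output_cost(model)
    # Assumption: whatever is left over is input plus cache-read spend.
    rest = REPORTED_TOTAL[model] - out
    print(f"{model}: output ${out:.2f}, remainder ${rest:.2f}")
```

On these figures, output tokens account for about $0.58 of GLM-5.2's $5.39 and about $5.42 of Opus's roughly $21.92. If the assumption holds, most of the bill in both runs comes from the input side, which is typical of agentic loops that resend context on every tool call.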

What is GLM-5.2 and why do open weights matter?

GLM-5.2 is Z.ai's flagship model, released under an MIT license. You can download the weights from Hugging Face or ModelScope and run it locally. Or you can call it through Z.ai's API or OpenRouter.

The model ships with a 1M-token context window and two thinking modes, High and Max, that trade latency for reasoning depth. It's built for long-horizon tasks, the kind of sustained coding work that runs for an hour or more.

One hard constraint: GLM-5.2 is text-only. It cannot read images. Any workflow that depends on screenshots, diagrams, or visual verification still needs a multimodal model. Opus can look at its own output and catch visual bugs. GLM cannot.

The open-weights angle is not just about cost. Closed models can disappear. Fable's recent deprecation reminded developers that an API you depend on can be shut down with little notice. Weights you download cannot be taken away. For teams building products on top of these models, that's a real risk consideration.

Why a WebGL platformer as the test?

The community already discounts zero-shot landing pages as a serious test. A model can produce something that looks impressive in one file. A 3D game in raw WebGL can't be faked that way. It requires a GLB parser, matrix math, shaders, animation, collision, a game loop. The pieces have to fit together across multiple files over many steps.

This tests two things at once. The agentic part: can the model hold a layered, multi-file build together over dozens of tool calls? The reasoning part: can it get engine internals right, the code that looks fine but quietly breaks?

Both models used the same CC0 assets from Kenney's Platformer Kit. The test was the engine and rendering, not asset loading.

Should you switch from Opus to GLM-5.2?

TechStackups says no, not as a primary model. Opus was faster, shipped cleaner code, and can visually verify its output. For their main coding workflows, Opus stays.

But GLM-5.2 earns a permanent slot in the toolkit. At a quarter of the price, it handles long agentic runs well enough for many tasks. And the open weights mean it will always be available. The team's framing: Opus for the work that needs to be right on the first pass, GLM-5.2 for the work where you can iterate and the cost matters.
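
One way to picture that framing is as a small routing rule. This is an illustrative sketch only: the Task fields and the choose_model function are hypothetical, and the returned strings are plain labels, not real API model identifiers.

```python
from dataclasses import dataclass

OPUS = "Claude Opus"   # label only, not an API model ID
GLM = "GLM-5.2"        # label only, not an API model ID

@dataclass
class Task:
    needs_vision: bool          # screenshots, diagrams, visual verification
    needs_first_pass: bool      # output has to be right without iteration
    cost_sensitive: bool        # long run where spend matters

def choose_model(task: Task) -> str:
    """Pick a model following the article's framing (hypothetical helper)."""
    # GLM-5.2 is text-only, so anything visual goes to the multimodal model.
    if task.needs_vision or task.needs_first_pass:
        return OPUS
    # Iteration is acceptable and cost matters: use the cheaper model.
    if task.cost_sensitive:
        return GLM
    # Otherwise stay on the primary model, as TechStackups does.
    return OPUS

print(choose_model(Task(needs_vision=False, needs_first_pass=False, cost_sensitive=True)))
```

A real setup would likely weigh more signals than three booleans, but the order of the checks captures the hard constraint first (vision) and the cost trade-off second.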

The comparison also surfaces a broader point about the open-versus-closed model debate. Open models are closing the capability gap. GLM-5.2 is not quite Opus, but it's close enough that for many tasks, the price difference makes it the better choice.

Pricing breakdown: GLM-5.2 vs Opus per million tokens

Model            Input   Cache read   Output
Claude Opus 4.8  $5.00   $0.50        $25.00
GLM-5.2          $1.40   $0.26        $4.40

On output tokens, GLM-5.2 is less than a fifth the cost of Opus. For long runs that generate hundreds of thousands of tokens, this adds up fast.
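
The discount is not uniform across token types. The sketch below divides each GLM-5.2 price by the matching Opus price, using only the per-million prices listed above; the price_ratio helper is named for this illustration.

```python
# USD per million tokens, as listed in the article.
PRICES = {
    "Claude Opus 4.8": {"input": 5.00, "cache_read": 0.50, "output": 25.00},
    "GLM-5.2": {"input": 1.40, "cache_read": 0.26, "output": 4.40},
}

def price_ratio(kind: str) -> float:
    """GLM-5.2 price as a fraction of the Opus price for one token type."""
    return PRICES["GLM-5.2"][kind] / PRICES["Claude Opus 4.8"][kind]

for kind in ("input", "cache_read", "output"):
    print(f"{kind}: {price_ratio(kind):.0%} of the Opus price")
# input: 28%, cache_read: 52%, output: 18%
```

Output tokens carry the deepest discount and cache reads the shallowest, so how much a given workload saves depends on its mix of fresh input, cached context, and generated output.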

Logicity's Take

The interesting signal here isn't that Opus is better. It's that GLM-5.2 is good enough. Two years ago, open models couldn't touch proprietary ones on complex coding tasks. Now the gap is speed and polish, not capability. For teams with budget constraints or long-running batch jobs, GLM-5.2 is a serious option. The open-weights insurance against API deprecation is a bonus that won't show up in benchmarks but matters when you're building a product.

Frequently Asked Questions

Can GLM-5.2 process images like Claude Opus?

No. GLM-5.2 is text-only. It cannot read images, screenshots, or diagrams. Workflows that require visual input still need a multimodal model like Opus.

How much cheaper is GLM-5.2 than Claude Opus?

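On list prices, GLM-5.2 is about 72% cheaper on input tokens ($1.40 versus $5 per million) and about 82% cheaper on output tokens ($4.40 versus $25 per million). The TechStackups run came out roughly 75% cheaper overall ($5.39 versus about $21.92).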

Is GLM-5.2 fully open source?

The weights are open. They are available under an MIT license on Hugging Face and ModelScope, and you can run the model locally with vLLM, SGLang, or Transformers.

Which model is better for coding agents?

Opus is faster and more accurate on first-pass output. GLM-5.2 is viable for long agentic runs where cost matters and you can tolerate more iteration.



Manaal Khan

Tech & Innovation Writer
