AI & Machine Learning

Xiaomi's MiMo-V2.5-Pro Builds a Compiler in 4.3 Hours

Manaal Khan · 3 May 2026 at 1:03 pm · 5 min read

Key Takeaways

MiMo-V2.5-Pro completed a university-level compiler project in 4.3 hours with 672 tool calls
Xiaomi says the model uses 40-60% fewer tokens than Claude Opus 4.6 or Gemini 3.1 Pro on comparable tasks
With 1.02 trillion parameters and a 1 million token context window, it ran autonomously for 11.5 hours in one demo

Xiaomi's AI lab has released MiMo-V2.5-Pro, an open-weight model that completed a university compiler project in 4.3 hours. The company says it matches Anthropic's Claude Opus 4.6 on coding benchmarks while burning through far fewer tokens.

The model uses a mixture-of-experts architecture. It contains 1.02 trillion total parameters but activates only 42 billion of them, about 4 percent, at a time. That keeps the compute per step well below what the total parameter count suggests, which helps on tasks that run for hours.
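The gap between total and active parameters can be made concrete with a little arithmetic and a toy router. The sketch below uses the two parameter counts reported above; the `route_top_k` function is a generic top-k gating illustration with made-up scores, not Xiaomi's published routing scheme, which the article does not describe.

```python
import math

# Figures reported in the article.
TOTAL_PARAMS = 1.02e12   # total parameters
ACTIVE_PARAMS = 42e9     # parameters active at a time


def active_fraction(active: float, total: float) -> float:
    """Share of the model's weights that take part in one forward step."""
    return active / total


def route_top_k(gate_scores: list[float], k: int) -> dict[int, float]:
    """Generic top-k gating (an assumption, not MiMo's documented router).

    Picks the k highest-scoring experts and softmax-normalises their
    scores so the selected experts' weights sum to 1.
    """
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    exps = [math.exp(gate_scores[i]) for i in top]
    total = sum(exps)
    return {i: e / total for i, e in zip(top, exps)}


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share: {fraction:.1%}")  # prints "Active share: 4.1%"

# Hypothetical gate scores for four experts; only two are selected.
weights = route_top_k([0.1, 2.0, -1.0, 2.0], k=2)
print(weights)
```

Only the selected experts run, so compute scales with the active count while memory still has to hold every expert.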

4.3 hours

Time MiMo-V2.5-Pro took to build a complete compiler from a Peking University course, a task Xiaomi says typically takes CS students several weeks

How the Architecture Works

MiMo-V2.5-Pro processes audio, images, and text through separate encoders. Each encoder translates its input into a format the language model can understand. All three feed into the same backbone, letting the model reason across modalities.

MiMo-V2.5 architecture diagram showing audio, visual, and text inputs feeding into the MiMo Hybrid-SWA backbone.

The context window is among the largest available. The main version handles up to 1 million tokens at once. A base version without additional training caps out at 256,000 tokens. For comparison, Claude's current context window sits at 200,000 tokens.

The Compiler Demo

Xiaomi demonstrated the model's capabilities with three coding challenges. The headline demo involved building a compiler from a Peking University computer science course.

MiMo-V2.5-Pro worked through the compiler in four phases. Its first compile run already passed 137 of 233 tests. By the end, it hit 233 of 233, a perfect score on the hidden test suite.
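As a quick check on those figures, the pass rates work out as follows. The snippet uses only the test counts reported above; the function name is illustrative.

```python
# Test counts reported in the article.
TOTAL_TESTS = 233
FIRST_RUN_PASSED = 137
FINAL_PASSED = 233


def pass_rate(passed: int, total: int) -> float:
    """Fraction of the test suite that passes."""
    return passed / total


first_rate = pass_rate(FIRST_RUN_PASSED, TOTAL_TESTS)
final_rate = pass_rate(FINAL_PASSED, TOTAL_TESTS)
still_failing = TOTAL_TESTS - FIRST_RUN_PASSED

print(f"First compile run: {first_rate:.1%} passing, {still_failing} tests failing")
print(f"Final run: {final_rate:.0%} passing")
```

So the first compile run already cleared a little under 59 percent of the suite, leaving 96 tests for the later phases.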

Line chart showing test pass rates climbing across four phases of a compiler project.

The process took 4.3 hours and 672 tool calls. Xiaomi says the approach mattered as much as the result. The model first laid out the entire pipeline as scaffolding, then worked through each stage layer by layer. When a refactoring phase introduced a regression, the model diagnosed and fixed it on its own.

11 Hours of Autonomous Coding

The second demo pushed the model harder. MiMo-V2.5-Pro wrote a desktop video editor from just a few prompts. The final codebase hit roughly 8,000 lines.

This task ran for 11.5 hours with about 1,870 tool calls. The model worked without human intervention throughout, planning, writing, debugging, and refining the code on its own.
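The durations and tool-call counts from the compiler and video editor demos allow a rough estimate of how fast the model was working. The sketch below derives an average pace from the reported numbers; the 1,870 figure is approximate in the source, and the averages say nothing about how the time was split between model output and tool execution.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One long-running demo as reported in the article."""
    name: str
    hours: float
    tool_calls: int

    @property
    def calls_per_hour(self) -> float:
        return self.tool_calls / self.hours

    @property
    def seconds_per_call(self) -> float:
        # Average wall-clock time per tool call, including model output time.
        return self.hours * 3600 / self.tool_calls


DEMOS = [
    Demo("compiler", hours=4.3, tool_calls=672),
    Demo("video editor", hours=11.5, tool_calls=1870),  # "about 1,870" in the source
]

for demo in DEMOS:
    print(f"{demo.name}: {demo.calls_per_hour:.0f} calls/hour, "
          f"{demo.seconds_per_call:.0f} s per call on average")
```

Both runs come out at roughly 155 to 165 tool calls per hour, or one call every 22 to 23 seconds, which suggests a similar working rhythm across the two tasks.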

For the third demo, Xiaomi connected the model to a circuit simulator through Claude Code. The task was designing a voltage regulator. Within an hour, the result hit all six technical specifications.

Token Efficiency Claims

Xiaomi says MiMo-V2.5-Pro requires 40 to 60 percent fewer tokens than Claude Opus 4.6 or Gemini 3.1 Pro for comparable tasks. Fewer tokens means lower API costs and faster completion times.
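To see what a 40 to 60 percent reduction could mean for a bill, here is a minimal sketch. The baseline token count and the price per million tokens are hypothetical placeholders, not published figures, and the sketch assumes both models charge the same per-token price, which the article does not state.

```python
def tokens_after_reduction(baseline_tokens: float, reduction: float) -> float:
    """Tokens used when a task needs `reduction` (0..1) fewer tokens."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_tokens * (1.0 - reduction)


def cost_usd(tokens: float, price_per_million_usd: float) -> float:
    """Cost of a token count at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
BASELINE_TOKENS = 10_000_000      # tokens a comparison model might use
PRICE_PER_MILLION_USD = 10.0      # assumed identical for both models

baseline_cost = cost_usd(BASELINE_TOKENS, PRICE_PER_MILLION_USD)
print(f"Baseline: ${baseline_cost:.2f}")

for reduction in (0.40, 0.60):    # the range Xiaomi claims
    tokens = tokens_after_reduction(BASELINE_TOKENS, reduction)
    print(f"{reduction:.0%} fewer tokens: ${cost_usd(tokens, PRICE_PER_MILLION_USD):.2f}")
```

Under these assumptions a task that costs $100 on the comparison model would cost between $40 and $60. If per-token prices differ between providers, the saving changes accordingly.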

Eight bar charts comparing MiMo-V2.5-Pro against Claude Opus 4.6, Gemini 3.1 Pro, and GPT-5.4 across coding, agent, and reasoning benchmarks.

These are internal benchmarks, not independent tests. The efficiency gap, if it holds in real-world use, would matter for companies running long autonomous tasks where token costs add up fast.

Open Weights vs Closed APIs

MiMo-V2.5-Pro is released with open weights. This means developers can download and run the model on their own hardware rather than paying per-token API fees.

Running a 1 trillion parameter model requires serious compute. Most organizations would need multi-GPU clusters. But for companies with that infrastructure, open weights offer control over data, customization options, and predictable costs.
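A back-of-the-envelope estimate shows why. The sketch below computes the memory needed just to hold 1.02 trillion weights at a few common precisions and divides by an assumed 80 GB of memory per GPU. The precisions and the GPU size are assumptions for illustration; the article does not say which formats Xiaomi ships, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
import math

TOTAL_PARAMS = 1.02e12  # total parameters, from the article

# Bytes per weight at common precisions (assumed, not confirmed for MiMo).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}

GPU_MEMORY_GB = 80  # assumed accelerator memory, in decimal gigabytes


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed to store the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def min_gpus_for_weights(params: float, bytes_per_param: float,
                         gpu_memory_gb: float) -> int:
    """Lower bound on GPU count: weights only, no cache or activations."""
    return math.ceil(weight_memory_gb(params, bytes_per_param) / gpu_memory_gb)


for precision, nbytes in BYTES_PER_PARAM.items():
    gb = weight_memory_gb(TOTAL_PARAMS, nbytes)
    gpus = min_gpus_for_weights(TOTAL_PARAMS, nbytes, GPU_MEMORY_GB)
    print(f"{precision}: {gb:,.0f} GB of weights, at least {gpus} GPUs")
```

Even at 4-bit precision the weights alone exceed 500 GB, so a single-GPU deployment is out of reach. Mixture-of-experts routing reduces compute per step, not the memory needed to keep every expert loaded.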

What This Means for AI Coding Tools

Current AI coding assistants like GitHub Copilot or Cursor work well for short tasks: autocompleting a function, explaining a code block, fixing a bug. They struggle with multi-hour projects that require sustained planning.

MiMo-V2.5-Pro is built for the opposite use case. Its 1 million token context and mixture-of-experts efficiency let it tackle projects that take hours and thousands of tool calls. If the demos reflect real capability, this is a different category of coding assistant.

Screenshot of MiMo Studio with the model dropdown open, highlighting the MiMo Chat sidebar entry and three TTS variants.

Six bar charts comparing word error rates across MiMo-V2.5-ASR, Qwen3-ASR-1.7B, Seed-ASR 2.0, Whisper-Large-V3, FunASR-1.5, and Gemini-3.1-Pro.

Frequently Asked Questions

What is MiMo-V2.5-Pro?

MiMo-V2.5-Pro is Xiaomi's new open-weight AI model with 1.02 trillion parameters, designed for long-running autonomous coding tasks.

How does MiMo-V2.5-Pro compare to Claude Opus?

According to Xiaomi's internal benchmarks, MiMo-V2.5-Pro lands close to Claude Opus 4.6 on coding tasks while using 40-60% fewer tokens.

Can I run MiMo-V2.5-Pro locally?

Yes, the model has open weights. However, running a 1 trillion parameter model requires significant GPU infrastructure.

What is the context window size for MiMo-V2.5-Pro?

The main version handles up to 1 million tokens. The base version, without additional training, caps out at 256,000 tokens.

What tasks has MiMo-V2.5-Pro completed in demos?

Xiaomi showed it building a complete compiler in 4.3 hours, writing an 8,000-line video editor in 11.5 hours, and designing a voltage regulator circuit in under an hour.

Source: The Decoder / Jonathan Kemper
