MiniMax Releases MiniMax M2: A Mini Open Model Built for Max Coding and Agentic Workflows at 8% Claude Sonnet Price and ~2x Faster

Is it possible for an open-source Mixture of Experts (MoE) model to efficiently drive agentic coding workflows at a fraction of the cost of leading flagship models, while maintaining robust long-term tool integration across multiple environments such as MCP, shell, browser, retrieval, and coding? The MiniMax team has unveiled MiniMax-M2, a specialized MoE model tailored for coding and agent-driven workflows. This model’s weights are openly available on Hugging Face under the MIT license. Designed for comprehensive tool utilization, multi-file editing, and extended planning horizons, MiniMax-M2 boasts 229 billion total parameters, with approximately 10 billion active per token. This selective activation strategy helps manage memory consumption and latency during iterative agent operations.

MiniMax-M2 architecture overview — MiniMax-M2 model architecture and workflow overview

Understanding MiniMax-M2’s Architecture and the Importance of Activation Size

MiniMax-M2 is a streamlined MoE model that activates roughly 10 billion parameters per token, significantly reducing memory overhead and minimizing tail latency during the critical plan-act-verify cycles common in agent workflows. This efficient activation budget enables faster processing speeds and lower operational costs compared to dense models with similar performance levels, making it ideal for concurrent execution in continuous integration (CI), browsing, and retrieval pipelines.

One of the model’s unique features is its “interleaved thinking” mechanism. The development team encapsulates internal reasoning within <think>...</think> tags, instructing users to preserve these segments throughout multi-turn interactions. Omitting these reasoning blocks has been shown to degrade performance in complex, multi-step tasks and chained tool executions, underscoring their critical role in maintaining high-quality outputs.

Performance Benchmarks Focused on Coding and Agentic Tasks

The MiniMax team emphasizes evaluations that mirror real-world developer workflows rather than static question-answering benchmarks. On the Terminal Bench, MiniMax-M2 achieves a score of 46.3, while on the Multi SWE Bench it records 36.2. The model scores 44.0 on BrowseComp and reaches 69.4 on the SWE Bench Verified, utilizing a scaffolded approach with OpenHands, supporting 128k context windows and up to 100-step reasoning chains. These results highlight the model’s strength in handling complex coding and agentic scenarios.

MiniMax-M2 benchmark results — Detailed benchmark results for MiniMax-M2

According to the official announcement, MiniMax-M2 operates at just 8% of the cost of Claude Sonnet, while delivering nearly twice the speed. The release also includes a limited-time free access period, with transparent token pricing and trial deadlines clearly outlined for users.

Comparing MiniMax M1 and M2: Evolution in Design and Capabilities

Feature	MiniMax M1	MiniMax M2
Total Parameters	456 billion	229 billion (model card states 230B)
Active Parameters per Token	45.9 billion	10 billion
Core Architecture	Hybrid MoE with Lightning Attention	Sparse MoE optimized for coding and agent workflows
Reasoning Format	Thinking budget variants (40k and 80k tokens) in RL training; no mandatory think tags	Interleaved thinking with mandatory `<think>...</think>` tags preserved across turns
Benchmark Suites	AIME, LiveCodeBench, SWE-bench Verified, TAU-bench, long context MRCR, MMLU-Pro	Terminal-Bench, Multi SWE-Bench, SWE-bench Verified, BrowseComp, GAIA text-only, Artificial Analysis Intelligence suite
Default Inference Settings	Temperature 1.0, top-p 0.95	Temperature 1.0, top-p 0.95, top-k 20-40 (varies by documentation)
Serving Recommendations	vLLM preferred; Transformers also supported	vLLM and SGLang recommended; includes detailed tool-calling guides
Primary Use Cases	Long-context reasoning, efficient test-time compute scaling, CISPO reinforcement learning	Native support for agentic and coding workflows across shell, browser, retrieval, and code execution

Essential Insights and Highlights

MiniMax-M2 is fully open-source with weights available on Hugging Face under the permissive MIT license, provided in safetensors format supporting FP32, BF16, and FP8 (F8_E4M3) precision.
The model’s compact MoE design activates only about 10 billion parameters per token out of 229 billion total, enabling reduced memory usage and consistent latency during iterative agent planning and verification loops.
Internal reasoning is explicitly wrapped in <think>...</think> tags, which must be preserved in conversation histories to maintain performance in multi-step and tool-augmented tasks.
Benchmark results span multiple agent and coding-focused suites including Terminal-Bench, Multi SWE-Bench, and BrowseComp, with detailed scaffolding for reproducibility. Deployment guides for SGLang and vLLM are provided to facilitate immediate use.

Final Thoughts

MiniMax-M2 represents a significant advancement in open-source MoE models, delivering a balance of scale, efficiency, and specialized capabilities for agentic coding workflows. Its release under the MIT license with comprehensive deployment documentation and compatibility with popular serving frameworks like vLLM and SGLang makes it a compelling choice for developers and researchers seeking cost-effective, high-performance models. The model’s innovative interleaved thinking approach and optimized activation budget position it well for complex, multi-step tool use cases, setting a new standard for open-source AI in coding and agentic applications.

MiniMax Releases MiniMax M2: A Mini Open Model Built for Max Coding and Agentic Workflows at 8% Claude Sonnet Price and ~2x Faster

Understanding MiniMax-M2’s Architecture and the Importance of Activation Size

Performance Benchmarks Focused on Coding and Agentic Tasks

Comparing MiniMax M1 and M2: Evolution in Design and Capabilities

Essential Insights and Highlights

Final Thoughts

African startups have $60B in return. How will they do it?

Google Launches New AI Scam detection in Circle to Search, Google...

Black Friday deals under 50 dollars: Apple AirTags Legos Ugreen chargers...

Google rolling out Gemini 3 Deep Think for AI Ultra

Recomended

African startups have $60B in return. How will they do it?

Google Launches New AI Scam detection in Circle to Search, Google Lens and Google Lens

Black Friday deals under 50 dollars: Apple AirTags Legos Ugreen chargers Blink cameras and other items

Google rolling out Gemini 3 Deep Think for AI Ultra

OpenAI says ChatGPT can save the average worker an hour per day

OpenAI boasts enterprise win days after internal ‘code red’ on Google threat