MoE Model Surpasses Benchmarks

MiniMax’s new M2.1 model has emerged as a powerhouse in AI agent performance, outperforming competitors across multiple benchmarks. The model uses a clever Mixture-of-Experts (MoE) design that activates only about 10 billion parameters per token, while keeping a much larger total parameter pool for overall capacity. This sparse approach cuts the compute needed for each response while still delivering top results.
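
To see why the active-versus-total split matters, here is a back-of-the-envelope accounting sketch in Python. The expert count, experts per token, and per-expert sizes below are illustrative assumptions, not published M2.1 specifications; only the roughly 10-billion-active figure comes from the reporting above.

```python
# Back-of-the-envelope MoE parameter accounting.
# All sizes below are illustrative assumptions, NOT published M2.1 specs.

NUM_EXPERTS = 64           # total experts per MoE layer (assumed)
EXPERTS_PER_TOKEN = 4      # experts activated per token (assumed)
PARAMS_PER_EXPERT = 2.0e9  # parameters in one expert (assumed)
SHARED_PARAMS = 2.0e9      # attention, embeddings, etc. (assumed)

total_params = SHARED_PARAMS + NUM_EXPERTS * PARAMS_PER_EXPERT
active_params = SHARED_PARAMS + EXPERTS_PER_TOKEN * PARAMS_PER_EXPERT

print(f"Total parameters:  {total_params / 1e9:.0f}B")  # ~130B stored
print(f"Active per token:  {active_params / 1e9:.0f}B")  # ~10B computed
```

The point of the arithmetic: you pay storage for the full pool, but per-token compute scales only with the activated slice.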

The model shines especially in coding tasks, scoring 49.4% on Multi-SWE-Bench and an impressive 72.5% on parts of SWE-Bench Multilingual. It also posted an 88.6 average on the VIBE benchmark, with particularly strong scores in the web and Android development categories (91.5 and 89.7, respectively). These numbers put it in direct competition with leading closed-source models such as Gemini 3 Pro and Claude 4.5 Sonnet.

M2.1’s secret weapon is its sparse routing system: each token is sent to just a small group of experts rather than through all parameters at once. The experts are specially tuned for code, multi-language support, tool use, and reasoning chains, which makes the model especially effective inside agent systems like Claude Code, Droid/Factory AI, and Kilo Code. A minimal sketch of this style of routing appears below.
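
M2.1’s actual router is not public, so the following PyTorch sketch shows a generic top-k sparse MoE layer, the standard form of the routing idea described above. The layer sizes and expert count are arbitrary; the nested loop is written for readability, not speed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Generic top-k sparse MoE layer; illustrates the routing idea,
    not M2.1's actual (unpublished) implementation."""

    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Score every expert, keep only the top k.
        logits = self.router(x)                     # (tokens, num_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # per-token expert choices
        weights = F.softmax(weights, dim=-1)        # normalize the kept scores
        out = torch.zeros_like(x)
        # Didactic dispatch loop: each expert only ever sees its own tokens.
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out
```

A dense layer would push every token through every expert; here each token touches only k of them, which is where the compute savings behind the headline numbers come from.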

Users report better performance when the model works with tools and follows multi-step instructions. The model also incorporates a novel Interleaved Thinking approach that alternates reasoning with action for stronger problem-solving. It makes fewer mistakes when switching between tasks and works across programming languages including Rust, Java, Go, C++, and TypeScript. The model is also more token-efficient, which means lower costs in long conversations, helped by cleaner, more concise outputs than previous models produced.
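
Agent harnesses like the ones named above typically drive a model through a tool-calling loop. The sketch below shows that general shape against an OpenAI-compatible chat endpoint using the openai Python SDK; the base URL, model name, and `run_tests` tool are placeholders for illustration, not documented MiniMax values.

```python
import json
from openai import OpenAI

# Placeholders: base_url and model name are assumptions, not documented values.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",  # hypothetical tool, purely for illustration
        "description": "Run the project's test suite and return the output.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}]

messages = [{"role": "user", "content": "Fix the failing test in utils.py."}]

while True:
    resp = client.chat.completions.create(
        model="MiniMax-M2.1", messages=messages, tools=tools
    )
    msg = resp.choices[0].message
    messages.append(msg)            # keep the assistant turn in the transcript
    if not msg.tool_calls:          # no more tool requests: final answer
        print(msg.content)
        break
    for call in msg.tool_calls:     # execute each requested tool locally
        result = "3 passed, 0 failed"  # stub; a real harness would run the tool
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": result}
        )
```

Token efficiency matters here because every tool result gets appended to the transcript: a model that writes tighter intermediate output keeps each successive request cheaper.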

Perhaps most impressive is that M2.1 can run on consumer-level GPUs thanks to careful quantization and optimization. This makes powerful AI accessible to developers who don’t have expensive cloud setups, and the efficient design keeps energy use down without sacrificing performance.
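
For readers who want to try a quantized deployment, here is a minimal sketch using Hugging Face transformers with 4-bit bitsandbytes quantization, the common route for fitting large models onto consumer cards. The checkpoint name is a placeholder rather than a confirmed release artifact, and whether a given build fits a single GPU depends on the total parameter count and available VRAM.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Placeholder checkpoint name; substitute the actual published model ID.
MODEL_ID = "MiniMaxAI/MiniMax-M2.1"

# 4-bit NF4 quantization via bitsandbytes shrinks weight memory roughly 4x
# versus fp16, at a modest quality cost.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs/CPU as needed
)

prompt = "Write a Rust function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```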

As agent systems become more common in software development, M2.1’s specialized design gives it a clear edge in the growing AI assistant market.
