A notable trend has emerged in the AI research community: compact, open-source generative models that rival or even outperform their vastly larger, proprietary counterparts. This week brought a significant new result that further cements the movement.
Introducing TRM: A Compact Neural Network with Outsized Capabilities
Alexia Jolicoeur-Martineau, a senior AI researcher at the Samsung Advanced Institute of Technology (SAIT) in Montreal, unveiled a neural network called the Tiny Recursive Model (TRM), which contains a mere 7 million parameters. Despite its modest size, TRM matches or exceeds the performance of state-of-the-art language models that are up to 10,000 times larger, including notable giants like OpenAI’s o3-mini and Google’s Gemini 2.5 Pro, on some of the most challenging reasoning benchmarks in AI research.
Rethinking AI Model Development: Efficiency Over Scale
The primary motivation behind TRM is to demonstrate that high-performing AI systems can be developed without the exorbitant costs associated with training massive, multi-trillion parameter models. These large models typically require extensive GPU resources and power consumption, limiting accessibility. The findings were detailed in a research paper published openly, emphasizing a shift away from the prevailing notion that only colossal models can tackle complex tasks.
Jolicoeur-Martineau critiques the current AI landscape, stating, “Relying solely on enormous foundational models trained at great expense by large corporations is a misconception. The field is overly focused on exploiting existing large language models rather than innovating new methodologies.”
She further highlights the power of recursive reasoning, noting, “A small model, trained from scratch and capable of iteratively refining its own outputs, can achieve remarkable results without the need for massive computational resources.”
Open Access and Commercial Viability
TRM’s source code is publicly available under the permissive MIT License, allowing researchers and enterprises alike to adapt and deploy the model for various applications, including commercial use. This open approach encourages broader experimentation and adoption beyond large corporate labs.
Specialized Focus: Structured, Grid-Based Reasoning
It is important to note that TRM was specifically engineered to excel at structured, visual, grid-oriented problems such as Sudoku, maze navigation, and tasks from the ARC (Abstraction and Reasoning Corpus) benchmark. These tasks are designed to be straightforward for humans but notoriously difficult for AI, such as rearranging colors on a grid by inferring a rule from a similar, but not identical, worked example.
Architectural Innovation: From Complexity to Elegance
TRM represents a significant simplification compared to previous models. It builds on the earlier Hierarchical Reasoning Model (HRM), which utilized two interacting networks operating at different frequencies, inspired by biological processes and supported by complex mathematical theories like fixed-point theorems.
Jolicoeur-Martineau found this dual-network approach unnecessarily intricate. Instead, TRM employs a streamlined design: a single two-layer neural network that recursively refines its predictions. Starting with an embedded question and an initial guess, represented by variables x, y, and z, the model iteratively updates its internal state and answer until reaching a stable conclusion. This process corrects errors progressively without the need for additional hierarchical structures or complex mathematics.
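The update loop described above can be sketched in a few lines. The following is a minimal illustration of the idea, not the paper's implementation: the dimensions, step counts, and the tiny two-layer network are all placeholder choices, and the real model operates on token sequences rather than single vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # illustrative embedding width

# one tiny two-layer network, reused at every refinement step
W1 = rng.normal(0, 0.1, (3 * DIM, DIM))
W2 = rng.normal(0, 0.1, (DIM, DIM))

def step(x, y, z):
    """One pass of the shared network over question x, answer y, latent z."""
    h = np.tanh(np.concatenate([x, y, z]) @ W1)
    return h @ W2

def refine(x, y, z, n_latent=6):
    """Refine the latent state several times, then update the answer once."""
    for _ in range(n_latent):
        z = z + step(x, y, z)   # progressively correct the reasoning state
    y = y + step(x, y, z)       # revise the answer from the updated state
    return y, z

x = rng.normal(size=DIM)   # embedded question
y = np.zeros(DIM)          # initial answer guess
z = np.zeros(DIM)          # latent reasoning state
for _ in range(3):         # outer recursion: the same tiny net, reused
    y, z = refine(x, y, z)
```

The key point the sketch captures is that a single small network is applied repeatedly, with the answer and latent state carried forward between passes, rather than stacking many distinct layers.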
Recursion as a Substitute for Model Size
The fundamental insight behind TRM is that recursive processing can replace the need for deep and large architectures. By repeatedly reasoning over its own outputs (up to sixteen iterations), the model simulates the effect of a much deeper network while maintaining a lightweight, feed-forward structure. This approach mirrors the multi-step “chain-of-thought” reasoning used by large language models but achieves it with far fewer parameters and computational demands.
Efficiency is further enhanced by a halting mechanism that determines when the model’s output has sufficiently converged, preventing unnecessary computation and preserving accuracy.
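The halting idea can be illustrated with a simple convergence check. Note this is a stand-in for exposition: the model's actual halting criterion is learned during training, whereas the sketch below just stops when successive answers stop changing, capped at a maximum number of passes.

```python
import numpy as np

def refine_with_halting(update, y0, max_steps=16, tol=1e-3):
    """Iterate an update function until the answer stabilizes.

    A distance-based stand-in for a learned halting mechanism: stop as
    soon as successive answers differ by less than `tol`, capped at
    `max_steps` recursive passes.
    """
    y = y0
    for used in range(1, max_steps + 1):
        y_next = update(y)
        if np.linalg.norm(y_next - y) < tol:
            return y_next, used   # converged early: skip remaining passes
        y = y_next
    return y, max_steps

# toy update: repeated averaging toward a fixed target answer
target = np.ones(4)
y, used = refine_with_halting(lambda y: 0.5 * (y + target), np.zeros(4))
```

On this toy contraction the loop halts after roughly a dozen passes instead of always spending the full budget, which is the efficiency the halting mechanism buys.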
Benchmark Performance: Small Model, Big Impact
Despite its compact size, TRM delivers impressive results on several challenging benchmarks:
- 87.4% accuracy on Sudoku-Extreme, a significant improvement over HRM’s 55%
- 85% accuracy on Maze-Hard puzzles
- 45% accuracy on ARC-AGI-1
- 8% accuracy on ARC-AGI-2
These outcomes rival or surpass those of much larger models such as DeepSeek R1, Gemini 2.5 Pro, and o3-mini, despite TRM operating with less than 0.01% of their parameter counts. This suggests that recursive reasoning, rather than sheer scale, may be the key to mastering abstract and combinatorial reasoning tasks, areas where even leading generative models often falter.
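The scale claim is easy to sanity-check. The 70-billion-parameter comparator below is an assumed round figure for illustration (the exact sizes of the proprietary models are not public):

```python
# Sanity-check the scale claim: 7M parameters vs. a hypothetical
# 70B-parameter comparator (proprietary model sizes are not public).
trm_params = 7_000_000
big_params = 70_000_000_000           # assumed, for illustration

ratio = trm_params / big_params       # fraction of the large model's size
print(f"{ratio:.6f}")                 # 0.000100 -> 0.01%
print(f"{big_params // trm_params}x") # 10000x
```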
Minimalism as a Design Principle
TRM’s success is rooted in a philosophy of simplicity. Jolicoeur-Martineau observed that increasing the model’s size or depth led to overfitting on limited datasets, reducing performance. The two-layer architecture combined with recursive depth and deep supervision struck the optimal balance.
Interestingly, replacing self-attention mechanisms with a simpler multilayer perceptron improved results on tasks with small, fixed contexts like Sudoku. However, for larger, more complex grids such as those in ARC puzzles, self-attention remained beneficial. These findings emphasize the importance of tailoring model architecture to the nature and scale of the data rather than defaulting to maximal complexity.
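Why an MLP can stand in for attention here: when the context length is fixed, as in a 9x9 Sudoku grid flattened to 81 cells, positions can be mixed by a single learned matrix instead of computing attention scores. The sketch below is an MLP-Mixer-style illustration of that substitution under assumed dimensions, not the paper's exact layer:

```python
import numpy as np

rng = np.random.default_rng(1)
SEQ, DIM = 81, 16   # a 9x9 Sudoku flattened to 81 positions (assumed width)

# Because the context length is fixed, cross-position mixing can be a
# learned (SEQ x SEQ) matrix -- no attention scores needed.
W_mix = rng.normal(0, 0.05, (SEQ, SEQ))   # token mixing across positions
W_ch  = rng.normal(0, 0.05, (DIM, DIM))   # channel mixing per position

def mixer_block(tokens):
    """Attention-free mixing for a fixed-length sequence."""
    tokens = tokens + np.tanh(W_mix @ tokens)   # mix across the 81 cells
    tokens = tokens + np.tanh(tokens @ W_ch)    # mix within each cell
    return tokens

grid = rng.normal(size=(SEQ, DIM))
out = mixer_block(grid)
```

The trade-off mirrors the finding above: a fixed mixing matrix is cheaper but only works when every input has the same layout, while self-attention adapts to variable, larger grids like those in ARC.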
Accessible Training and Reproducibility
TRM is fully open source under the MIT license, with comprehensive training and evaluation scripts, dataset generators for Sudoku, Maze, and ARC-AGI, and reference configurations to replicate published results. Training requirements range from a single NVIDIA L40S GPU for Sudoku to multi-GPU H100 clusters for ARC-AGI, making the model accessible to a wide range of researchers.
The model’s training pipeline incorporates extensive data augmentation techniques, including color permutations and geometric transformations, highlighting that its efficiency stems from parameter economy rather than reduced computational effort.
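The kinds of augmentations described can be sketched on an ARC-style integer color grid. The details below (palette size, the particular transforms) follow ARC conventions but are illustrative of the approach rather than a reproduction of the actual pipeline:

```python
import numpy as np

def augment(grid, rng):
    """One random augmentation of an ARC-style integer color grid:
    permute the color palette, then apply a random rotation and flip.
    Illustrative of the transformations described, not the exact pipeline.
    """
    n_colors = 10                              # ARC uses colors 0..9
    perm = rng.permutation(n_colors)           # relabel every color consistently
    grid = perm[grid]
    grid = np.rot90(grid, k=rng.integers(4))   # random 90-degree rotation
    if rng.integers(2):                        # random horizontal flip
        grid = np.fliplr(grid)
    return grid

rng = np.random.default_rng(0)
g = np.array([[0, 1],
              [2, 3]])
aug = augment(g, rng)
```

Each augmented copy is a new training example, which is how a 7M-parameter model extracts so much from small datasets: the savings are in parameters, not in total compute.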
By removing the biological analogies, multiple network hierarchies, and fixed-point dependencies present in HRM, TRM offers a transparent and reproducible foundation for exploring recursive reasoning in compact models. This challenges the prevailing “scale is all you need” mindset dominating AI research.
Community Perspectives and Debates
The release of TRM sparked lively discussions among AI experts and practitioners. Many applauded the achievement as evidence that smaller models can outperform their larger counterparts, describing it as a promising step toward AI architectures that emphasize reasoning over brute force scaling.
However, some critics pointed out that TRM’s applicability is limited to narrowly defined, grid-based puzzles, and that its computational savings primarily arise from parameter reduction rather than overall runtime efficiency. Others noted that the model’s reliance on heavy data augmentation and multiple recursive passes effectively increases compute demands despite its small size.
Experts also emphasized that TRM is a specialized solver rather than a general-purpose language model, excelling in structured reasoning but not designed for open-ended text generation. The consensus suggests that while TRM’s domain is specific, its underlying message, that careful recursive reasoning can substitute for scale, holds broad implications for future AI research.
Future Directions: Expanding Recursive Reasoning
Looking forward, Jolicoeur-Martineau envisions extending TRM’s recursive framework to generative or multi-solution variants, enabling the model to propose multiple plausible answers rather than a single deterministic output. Another key research avenue involves investigating scaling laws for recursion to understand how the “less is more” principle applies as model complexity and dataset sizes increase.
Ultimately, TRM serves both as a practical tool and a conceptual challenge to the AI community: advancing artificial intelligence does not necessarily require ever-larger models. Instead, empowering smaller networks to think deeply and recursively may unlock new frontiers in reasoning capabilities.
