On January 15, MiniMax, a Shanghai-based AI startup, announced the release of its next-generation MiniMax-01 model series. The release includes the foundation language model MiniMax-Text-01 and the visual multimodal model MiniMax-VL-01.
The MiniMax-01 series features an innovative hybrid architecture: in every block of 8 layers, 7 use the linear Lightning Attention mechanism, while only 1 retains traditional softmax attention. This design keeps computational cost close to linear in sequence length, allowing the models to process extremely long inputs and outputs, as sketched below.
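To make that layout concrete, here is a minimal Python sketch of the reported layer schedule. The total depth and the exact position of the softmax layer within each block are illustrative assumptions, not details taken from MiniMax's implementation.

```python
# Illustrative sketch of the reported hybrid layout: in each block of 8 layers,
# 7 use linear (Lightning-style) attention and 1 keeps softmax attention.
# The depth and placement below are placeholders, not MiniMax's actual config.

NUM_LAYERS = 80  # placeholder depth

def build_attention_schedule(num_layers: int) -> list[str]:
    """Return the attention type assigned to each layer index."""
    schedule = []
    for i in range(num_layers):
        # Assume every 8th layer (1-indexed) falls back to softmax attention;
        # the other 7 layers in each block use linear attention.
        if (i + 1) % 8 == 0:
            schedule.append("softmax")
        else:
            schedule.append("linear")
    return schedule

if __name__ == "__main__":
    schedule = build_attention_schedule(NUM_LAYERS)
    print(schedule[:8])  # ['linear', 'linear', ..., 'softmax']
    print(schedule.count("linear"), schedule.count("softmax"))  # 70 10
```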
Since Google introduced the Transformer architecture in 2017, it has been the dominant paradigm for large models. Since 2023, however, the natural-language-processing field has seen a wave of innovation, with growing demand for architectural advances.
SEE ALSO: iQIYI Sues MiniMax for Unauthorized Use of Content in Model Training
The linear attention mechanism is a step in that direction: it replaces the quadratic relationship between input length and computational cost with a linear one.
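As an illustration of the underlying idea (not of MiniMax's Lightning Attention kernel itself), the sketch below contrasts the two evaluation orders: materializing the full n × n score matrix costs O(n²·d), whereas re-associating the matrix products costs O(n·d²). The feature map `phi` is a common choice from the linear-attention literature, assumed here purely for illustration.

```python
import numpy as np

def phi(x):
    # Positive feature map (ELU + 1), a common choice in linear attention;
    # illustrative only.
    return np.where(x > 0, x + 1.0, np.exp(x))

def quadratic_attention(Q, K, V):
    # O(n^2 * d): materializes the full n x n score matrix.
    scores = phi(Q) @ phi(K).T                       # (n, n)
    scores = scores / scores.sum(axis=1, keepdims=True)
    return scores @ V

def linear_attention(Q, K, V):
    # O(n * d^2): associativity lets us contract K with V first,
    # so no n x n matrix is ever formed.
    kv = phi(K).T @ V                                # (d, d)
    z = phi(Q) @ phi(K).sum(axis=0)                  # per-row normalizer, (n,)
    return (phi(Q) @ kv) / z[:, None]

# Both orderings give the same result; only the cost differs.
n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
assert np.allclose(quadratic_attention(Q, K, V), linear_attention(Q, K, V))
```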
The MiniMax-01 models can handle a context window of up to 4,000,000 tokens, 32 times longer than GPT-4o's and 20 times longer than Claude 3.5 Sonnet's. They have 456 billion parameters, with 45.9 billion activated per inference, making MiniMax-01 the world's first commercially viable model built on Lightning Attention.
Hailuo AI, a global AI application, has already integrated the open-source MiniMax-01 models, and the MiniMax API gives developers and enterprises access to the same capabilities. Pricing starts at ¥1 per million input tokens and ¥8 per million output tokens.
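At those list prices, a rough per-request cost estimate is straightforward; the token counts in the sketch below are hypothetical examples.

```python
# Quick cost estimate at the listed API prices: ¥1 per million input tokens
# and ¥8 per million output tokens. The token counts below are made up.

PRICE_INPUT_PER_M = 1.0   # RMB per 1,000,000 input tokens
PRICE_OUTPUT_PER_M = 8.0  # RMB per 1,000,000 output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in RMB for a single API call."""
    return (input_tokens / 1_000_000) * PRICE_INPUT_PER_M \
         + (output_tokens / 1_000_000) * PRICE_OUTPUT_PER_M

# Example: a long-context call with 2,000,000 input tokens and 4,000 output tokens.
print(f"¥{request_cost(2_000_000, 4_000):.4f}")  # ¥2.0320
```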