On January 15, MiniMax, a Shanghai-based AI startup, announced the release of its next-generation MiniMax-01 model series. The release includes the foundation language model MiniMax-Text-01 and the visual multimodal model MiniMax-VL-01.
The MiniMax-01 series features an innovative hybrid architecture: in every block of 8 layers, 7 use the linear Lightning Attention mechanism, while only 1 retains traditional softmax attention. This design keeps computational cost close to linear in sequence length, allowing the models to process extremely long inputs and outputs, as sketched below.
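To make that layout concrete, here is a minimal Python sketch of the reported layer schedule. The total depth and the exact position of the softmax layer within each block are illustrative assumptions, not details taken from MiniMax's implementation.

```python
# Illustrative sketch of the reported hybrid layout: in each block of 8 layers,
# 7 use linear (Lightning-style) attention and 1 keeps softmax attention.
# The depth and placement below are placeholders, not MiniMax's actual config.

NUM_LAYERS = 80  # placeholder depth

def build_attention_schedule(num_layers: int) -> list[str]:
    """Return the attention type assigned to each layer index."""
    schedule = []
    for i in range(num_layers):
        # Assume every 8th layer (1-indexed) falls back to softmax attention;
        # the other 7 layers in each block use linear attention.
        if (i + 1) % 8 == 0:
            schedule.append("softmax")
        else:
            schedule.append("linear")
    return schedule

if __name__ == "__main__":
    schedule = build_attention_schedule(NUM_LAYERS)
    print(schedule[:8])  # ['linear', 'linear', ..., 'softmax']
    print(schedule.count("linear"), schedule.count("softmax"))  # 70 10
```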
Since Google introduced the Transformer architecture in 2017, it has been the dominant paradigm for large models. Since 2023, however, the natural-language-processing field has seen a wave of innovation, with growing demand for architectural advances.
SEE ALSO: iQIYI Sues MiniMax for Unauthorized Use of Content in Model Training
The linear attention mechanism is a step in that direction: it replaces the quadratic relationship between input length and computational cost with a linear one.
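As an illustration of the underlying idea (not of MiniMax's Lightning Attention kernel itself), the sketch below contrasts the two evaluation orders: materializing the full n × n score matrix costs O(n²·d), whereas re-associating the matrix products costs O(n·d²). The feature map `phi` is a common choice from the linear-attention literature, assumed here purely for illustration.

```python
import numpy as np

def phi(x):
    # Positive feature map (ELU + 1), a common choice in linear attention;
    # illustrative only.
    return np.where(x > 0, x + 1.0, np.exp(x))

def quadratic_attention(Q, K, V):
    # O(n^2 * d): materializes the full n x n score matrix.
    scores = phi(Q) @ phi(K).T                       # (n, n)
    scores = scores / scores.sum(axis=1, keepdims=True)
    return scores @ V

def linear_attention(Q, K, V):
    # O(n * d^2): associativity lets us contract K with V first,
    # so no n x n matrix is ever formed.
    kv = phi(K).T @ V                                # (d, d)
    z = phi(Q) @ phi(K).sum(axis=0)                  # per-row normalizer, (n,)
    return (phi(Q) @ kv) / z[:, None]

# Both orderings give the same result; only the cost differs.
n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
assert np.allclose(quadratic_attention(Q, K, V), linear_attention(Q, K, V))
```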
The MiniMax-01 models can handle a context window of up to 4,000,000 tokens, 32 times longer than GPT-4o's and 20 times longer than Claude 3.5 Sonnet's. They have 456 billion parameters, with 45.9 billion activated per inference, making MiniMax-01 the world's first commercially viable model built on Lightning Attention.
Hailuo AI, a global AI application, has already integrated the open-source MiniMax-01 models, and the MiniMax API gives developers and enterprises access to the same capabilities. Pricing starts at ¥1 per million input tokens and ¥8 per million output tokens.
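At those list prices, a rough per-request cost estimate is straightforward; the token counts in the sketch below are hypothetical examples.

```python
# Quick cost estimate at the listed API prices: ¥1 per million input tokens
# and ¥8 per million output tokens. The token counts below are made up.

PRICE_INPUT_PER_M = 1.0   # RMB per 1,000,000 input tokens
PRICE_OUTPUT_PER_M = 8.0  # RMB per 1,000,000 output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in RMB for a single API call."""
    return (input_tokens / 1_000_000) * PRICE_INPUT_PER_M \
         + (output_tokens / 1_000_000) * PRICE_OUTPUT_PER_M

# Example: a long-context call with 2,000,000 input tokens and 4,000 output tokens.
print(f"¥{request_cost(2_000_000, 4_000):.4f}")  # ¥2.0320
```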