Huawei Atlas 950 SuperCluster revealed with 524 ExaFLOPS of AI performance

Image: Huawei’s Atlas 950 SuperCluster AI system

At Huawei Connect 2025, Huawei unveiled the Atlas 950 SuperCluster, an AI system rated at 524 FP8 ExaFLOPS for training workloads and 1 FP4 ZettaFLOPS for inference. Built from more than half a million Ascend 950DT neural processing units (NPUs), it is pitched against systems such as Oracle’s OCI SuperCluster and xAI’s Colossus, and as a rival to Nvidia’s upcoming Rubin platform.

Innovative Architecture and Unmatched Performance

The Atlas 950 SuperCluster integrates 524,288 Ascend 950DT NPUs in a modular design of 64 Atlas SuperPoDs, each containing 8,192 Ascend 950DT processors. Collectively, Huawei says, they deliver a twentyfold increase in processing capability over its previous Atlas 900 A3 system. The architecture supports both the industry-standard RoCE (RDMA over Converged Ethernet) and Huawei’s proprietary UBoE protocol for reliable, low-latency communication.
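
The cluster-level totals follow from the per-SuperPoD figures. The short Python check below uses only numbers quoted in this article; the per-NPU and per-SuperPoD throughput values it prints are derived here from those totals and are not official specifications.

```python
# Figures quoted in the article.
SUPERPODS = 64
NPUS_PER_SUPERPOD = 8_192
CLUSTER_FP8_FLOPS = 524e18  # 524 FP8 ExaFLOPS (training)
CLUSTER_FP4_FLOPS = 1e21    # 1 FP4 ZettaFLOPS (inference)

# Total accelerator count across the whole SuperCluster.
total_npus = SUPERPODS * NPUS_PER_SUPERPOD

# Implied per-NPU throughput in PetaFLOPS (derived, not quoted).
fp8_per_npu_pflops = CLUSTER_FP8_FLOPS / total_npus / 1e15
fp4_per_npu_pflops = CLUSTER_FP4_FLOPS / total_npus / 1e15

# Implied per-SuperPoD FP8 throughput in ExaFLOPS (derived, not quoted).
fp8_per_superpod_eflops = CLUSTER_FP8_FLOPS / SUPERPODS / 1e18

print(f"Total NPUs:           {total_npus:,}")
print(f"FP8 per NPU:          {fp8_per_npu_pflops:.2f} PFLOPS")
print(f"FP4 per NPU:          {fp4_per_npu_pflops:.2f} PFLOPS")
print(f"FP8 per SuperPoD:     {fp8_per_superpod_eflops:.2f} EFLOPS")
```

The product comes to 524,288 NPUs, which matches the “more than half a million” figure, and implies roughly 1 PFLOPS of FP8 throughput per NPU and a little over 8 FP8 ExaFLOPS per SuperPoD.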

The system uses an all-optical interconnect with 16 petabytes per second of bandwidth and latency as low as 2.1 microseconds, enabling the rapid data transfer that large-scale AI model training and inference require.

Outperforming Industry Giants

Oracle’s OCI SuperCluster uses 131,072 B200 GPUs and delivers up to 2.4 FP4 ZettaFLOPS. The Atlas 950 fields four times as many accelerators, although its quoted 1 FP4 ZettaFLOPS is below Oracle’s peak inference figure. It also deploys 2.5 times more accelerators than xAI’s Colossus. Each Atlas 950 SuperPoD delivers 8 FP8 ExaFLOPS, well above the 1.2 FP8 ExaFLOPS of Nvidia’s Vera Rubin NVL144.
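
The comparisons above reduce to a few ratios. The sketch below computes them from the peak figures quoted in this article, which are vendor ratings rather than measured results. It assumes “2.5 times more NPUs” means 2.5 times as many accelerators as Colossus, so the Colossus count it prints is an inference, not a quoted number.

```python
# Peak figures as quoted in the article (vendor ratings, not benchmarks).
ATLAS_NPUS = 64 * 8_192              # 64 SuperPoDs x 8,192 NPUs
ATLAS_FP4_ZFLOPS = 1.0
ATLAS_SUPERPOD_FP8_EFLOPS = 8.0
OCI_GPUS = 131_072
OCI_FP4_ZFLOPS = 2.4
NVL144_FP8_EFLOPS = 1.2
COLOSSUS_MULTIPLE = 2.5              # assumed to mean "2.5x as many"

# Accelerator count and peak FP4 throughput relative to Oracle's system.
accelerators_vs_oci = ATLAS_NPUS / OCI_GPUS
fp4_vs_oci = ATLAS_FP4_ZFLOPS / OCI_FP4_ZFLOPS

# One Atlas 950 SuperPoD relative to one Vera Rubin NVL144 (FP8).
superpod_vs_nvl144 = ATLAS_SUPERPOD_FP8_EFLOPS / NVL144_FP8_EFLOPS

# Colossus accelerator count implied by the 2.5x claim (derived).
implied_colossus_accelerators = ATLAS_NPUS / COLOSSUS_MULTIPLE

print(f"Accelerators vs OCI:    {accelerators_vs_oci:.1f}x")
print(f"Peak FP4 vs OCI:        {fp4_vs_oci:.2f}x")
print(f"SuperPoD vs NVL144 FP8: {superpod_vs_nvl144:.1f}x")
print(f"Implied Colossus size:  {implied_colossus_accelerators:,.0f}")
```

On these numbers the Atlas 950 has four times as many accelerators as the OCI SuperCluster but a lower quoted peak FP4 figure, while a single SuperPoD is rated at several times the FP8 throughput of one NVL144.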

Designed to accommodate AI models with parameter counts ranging from hundreds of billions to tens of trillions, the Atlas 950 SuperCluster is tailored for next-generation AI research and development, meeting the demands of cutting-edge machine learning workloads.

Massive Scale and Infrastructure Requirements

Each SuperPoD occupies approximately 1,000 square meters, roughly the size of two professional basketball courts, and houses 160 server cabinets. The full Atlas 950 SuperCluster spans about 64,000 square meters, comparable to 150 basketball courts or nine full-sized soccer fields. The footprint reflects the number of accelerators deployed, which require substantial cooling, power delivery, and support infrastructure.
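
The footprint comparisons can be checked the same way. In the sketch below, the SuperPoD area and cabinet count come from this article. The court and pitch dimensions are assumptions introduced for illustration: a 28 m x 15 m basketball court and a 105 m x 68 m soccer pitch.

```python
# Figures quoted in the article.
SUPERPODS = 64
AREA_PER_SUPERPOD_M2 = 1_000
CABINETS_PER_SUPERPOD = 160

# Assumed reference areas (not from the article).
BASKETBALL_COURT_M2 = 28 * 15    # 420 m^2
SOCCER_PITCH_M2 = 105 * 68       # 7,140 m^2

# Cluster-wide totals.
total_area_m2 = SUPERPODS * AREA_PER_SUPERPOD_M2
total_cabinets = SUPERPODS * CABINETS_PER_SUPERPOD

# Floor area per cabinet, including aisles and support space.
area_per_cabinet_m2 = AREA_PER_SUPERPOD_M2 / CABINETS_PER_SUPERPOD

# Equivalent number of courts and pitches.
courts = total_area_m2 / BASKETBALL_COURT_M2
pitches = total_area_m2 / SOCCER_PITCH_M2

print(f"Total area:       {total_area_m2:,} m^2")
print(f"Total cabinets:   {total_cabinets:,}")
print(f"Area per cabinet: {area_per_cabinet_m2:.2f} m^2")
print(f"Basketball courts: {courts:.0f}")
print(f"Soccer pitches:    {pitches:.1f}")
```

With those assumed dimensions, the full cluster works out to 10,240 cabinets across 64,000 square meters, about 152 courts or 9 pitches, in line with the article’s rounded comparisons.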

Looking Ahead: The Atlas 960 and Beyond

Huawei plans to launch the Atlas 950 SuperCluster by the end of 2026, with the more powerful Atlas 960 SuperCluster slated for release in 2027. The Atlas 960 is expected to feature over one million NPUs and deliver two to four ZettaFLOPS of computing power.

Huawei’s strategy is to scale up the number of processing units to compensate for lower per-chip performance, which allows it to build very large AI supercomputers. These systems are aimed at AI companies that need very large amounts of compute to develop advanced models and applications.
