
Inside Huawei’s plan to make thousands of AI chips think like one computer


Envision a vast network of thousands of advanced AI processors distributed across numerous server racks, seamlessly operating as a unified, colossal computing entity. This vision became reality at HUAWEI CONNECT 2025, where Huawei introduced a revolutionary AI infrastructure design poised to transform the construction and expansion of artificial intelligence systems worldwide.

Departing from conventional models where servers function independently or in small clusters, Huawei’s innovative SuperPoD technology integrates thousands of discrete processing units into a cohesive logical machine. This unified system is capable of collective learning, reasoning, and decision-making, effectively functioning as a single intelligent entity.

UnifiedBus 2.0: The Backbone of Next-Gen AI Infrastructure

Central to Huawei’s breakthrough is the UnifiedBus (UB) interconnect protocol, which underpins the SuperPoD architecture. Yang Chaobin, Huawei’s ICT Business Group CEO, highlighted that this protocol enables deep interconnection among physical servers, allowing them to operate as one logical server with synchronized cognitive capabilities.

Historically, scaling AI computing has been hindered by two main obstacles: maintaining reliable long-distance communication and balancing bandwidth with latency. Traditional copper cabling offers high bandwidth but is limited to short distances, typically connecting only a few cabinets. Conversely, optical fibers support longer distances but face reliability challenges that worsen as scale increases.

Eric Xu, Huawei’s Deputy Chairman, emphasized that overcoming these connectivity hurdles was critical. The UnifiedBus 2.0 protocol incorporates multi-layered reliability mechanisms, from the physical layer through the network and transmission layers, featuring fault detection and protection switching at the nanosecond scale. This ensures that transient optical link disruptions remain invisible to applications, guaranteeing uninterrupted performance.
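Huawei has not published UnifiedBus internals, but the protection-switching idea — faults handled below the application so callers never observe them — can be illustrated with a minimal Python sketch. All names here (`OpticalPath`, `send_with_protection`) are hypothetical stand-ins, not actual UnifiedBus APIs:

```python
import random

class OpticalPath:
    """A toy optical link path that can suffer transient faults."""
    def __init__(self, name, fault_rate):
        self.name = name
        self.fault_rate = fault_rate

    def transmit(self, payload, rng):
        # Simulate a transient fault with probability fault_rate.
        if rng.random() < self.fault_rate:
            raise IOError(f"transient fault on {self.name}")
        return f"{payload} via {self.name}"

def send_with_protection(payload, paths, rng):
    """Try each path in priority order, failing over on a fault.

    The caller only ever sees a successful result, mirroring the claim
    that link-level protection switching hides faults from applications.
    """
    for path in paths:
        try:
            return path.transmit(payload, rng)
        except IOError:
            continue  # protection switch: fall through to the next path
    raise RuntimeError("all paths down")

rng = random.Random(42)
paths = [OpticalPath("primary", fault_rate=0.3),
         OpticalPath("backup", fault_rate=0.0)]
results = [send_with_protection(f"pkt{i}", paths, rng) for i in range(5)]
print(all(r.startswith("pkt") for r in results))
```

In a real system the switchover happens in hardware at nanosecond timescales rather than via exception handling, but the application-facing contract is the same: every send completes even when the primary path faults.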

SuperPoD Architecture: Unprecedented Scale and Computing Power

The Atlas 950 SuperPoD exemplifies this architecture’s capabilities, housing up to 8,192 Ascend 950DT AI chips. Xu described its performance as delivering 8 exaFLOPS (EFLOPS) in FP8 precision and 16 EFLOPS in FP4 precision, with an interconnect bandwidth reaching 16 petabytes per second (PB/s). To put this in perspective, this bandwidth surpasses the combined peak internet bandwidth of the entire world by more than tenfold.

Spanning 160 cabinets over 1,000 square meters, the system includes 128 compute cabinets and 32 communication cabinets interconnected exclusively via optical links. It boasts a massive 1,152 terabytes (TB) of memory and maintains an ultra-low system latency of 2.1 microseconds.

Looking ahead, the Atlas 960 SuperPoD will scale even further, integrating 15,488 Ascend 960 chips across 220 cabinets covering 2,200 square meters. This next-generation system is projected to deliver 30 EFLOPS in FP8 and 60 EFLOPS in FP4, with 4,460 TB of memory and 34 PB/s interconnect bandwidth, setting new benchmarks for AI computing.
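The headline totals imply per-chip throughput figures that can be back-checked with simple arithmetic. The sketch below divides each stated aggregate FP8 number by the chip count; the per-chip results are derived here for perspective and are not figures Huawei quotes:

```python
# Back-of-envelope: per-chip FP8 throughput implied by the stated totals.
# (Derived figures; the source quotes only the aggregates.)
EFLOPS = 1e18  # 10^18 floating-point operations per second

systems = {
    "Atlas 950": {"chips": 8_192,  "fp8_eflops": 8},
    "Atlas 960": {"chips": 15_488, "fp8_eflops": 30},
}

for name, s in systems.items():
    per_chip_pflops = s["fp8_eflops"] * EFLOPS / s["chips"] / 1e15
    print(f"{name}: ~{per_chip_pflops:.2f} PFLOPS FP8 per chip")
# → Atlas 950: ~0.98 PFLOPS FP8 per chip
# → Atlas 960: ~1.94 PFLOPS FP8 per chip
```

The implied roughly 2x jump in per-chip FP8 throughput from the Ascend 950DT to the Ascend 960 is consistent with the generational scaling the article describes.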

Expanding Horizons: SuperPoD for General-Purpose Computing

Beyond AI-specific tasks, Huawei’s SuperPoD concept extends to general enterprise computing through the TaiShan 950 SuperPoD, powered by Kunpeng 950 processors. This platform targets the replacement of outdated mainframes and mid-range servers, offering a modern, scalable alternative.

Particularly in the financial sector, Xu noted that the TaiShan 950 SuperPoD, combined with the distributed GaussDB database, presents a compelling substitute for legacy systems such as mainframes and Oracle’s Exadata servers, promising enhanced efficiency and cost-effectiveness.

Fostering Innovation Through Open Architecture

In a strategic move to accelerate AI infrastructure development, Huawei has made the UnifiedBus 2.0 technical specifications publicly available as open standards. This openness aims to cultivate a collaborative ecosystem, enabling partners to create customized SuperPoD solutions tailored to diverse industry needs.

Recognizing the ongoing challenges in semiconductor manufacturing, especially within China, Huawei’s leadership stressed the importance of leveraging currently accessible process nodes to sustain computing power growth. Yang Chaobin emphasized that their open-hardware and open-source-software strategy is designed to empower developers and foster innovation across the AI infrastructure landscape.

Huawei plans to open-source a comprehensive suite of hardware components, including NPU modules, air- and liquid-cooled blade servers, AI accelerator cards, CPU boards, and cascade cards. On the software front, the company commits to releasing the CANN compiler tools, Mind series application kits, and openPangu foundation models by the end of 2025.

Real-World Impact and Industry Adoption

Huawei’s advancements are already making waves in practical deployments. In 2025 alone, over 300 Atlas 900 A3 SuperPoD units have been delivered to more than 20 clients spanning sectors such as internet services, finance, telecommunications, energy, and manufacturing.

This open ecosystem approach addresses the constraints imposed by limited access to cutting-edge semiconductor fabrication, enabling broader participation in AI infrastructure innovation without reliance on the most advanced chip manufacturing technologies.

Globally, Huawei’s open architecture challenges the prevailing proprietary models favored by Western competitors, offering an alternative path that could democratize AI infrastructure development. The success of this model in delivering competitive performance and commercial viability at scale remains a critical test for the future.

Ultimately, the SuperPoD architecture signifies a paradigm shift in how vast computational resources are interconnected, managed, and scaled. By releasing its specifications and components as open-source, Huawei is betting on collaborative innovation to accelerate AI infrastructure evolution, potentially reshaping competitive dynamics in the global AI market.
