Serving technology enthusiasts for more than 25 years. TechSpot is a trusted source for tech advice and analysis.
What just happened? According to reports, some of Nvidia’s largest enterprise customers are delaying orders for the latest Blackwell racks because of overheating problems and glitches with chip connectivity. Nvidia shares fell four percent in early trading after the news broke. The Information reports that Blackwell GB200 racks, which are crucial components for data centers, have experienced issues during initial deployments. The problem stems from the unprecedented power consumption of these cutting-edge GPUs. Each rack draws a staggering 120 to 132 kW, a power density that has pushed traditional cooling systems to their limits.
Initial shipments of Blackwell racks revealed interconnect glitches that hindered efficient heat distribution, creating hotspots. The multi-chip module, which integrates two large GPU dies into a single package, is a complex design that exacerbates heat management issues.
These thermal inefficiencies increase dramatically as deployments scale up, with configurations containing up to 72 Blackwell chips per rack. The current server rack design has proven inadequate to handle the extreme heat output, prompting Nvidia’s suppliers to request multiple modifications. Fixing the problem will require a combination of chip-level optimizations and the development of advanced cooling solutions. It may also require a complete overhaul of the server rack infrastructure.
Microsoft, for example, had originally planned to install GB200 racks with at least 50,000 Blackwell chips in one of its Phoenix facilities. As delays grew, Microsoft’s main partner, OpenAI, requested Nvidia’s older-generation ‘Hopper’ chips instead.
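For a rough sense of scale, the figures reported here can be combined in a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes every rack is fully populated with 72 chips and attributes the entire rack draw to those chips, neither of which the reports confirm. The function and variable names are made up for this example.

```python
import math

# Figures quoted in the article
RACK_POWER_KW = (120, 132)   # reported per-rack draw, low and high
CHIPS_PER_RACK = 72          # "up to 72" chips per rack (assumed fully populated)
PLANNED_CHIPS = 50_000       # "at least 50,000" chips in the reported Microsoft plan


def racks_needed(chips, chips_per_rack=CHIPS_PER_RACK):
    """Whole racks required to hold the given number of chips."""
    return math.ceil(chips / chips_per_rack)


def total_power_mw(racks, rack_kw):
    """Aggregate draw in megawatts for a number of identical racks."""
    return racks * rack_kw / 1000


racks = racks_needed(PLANNED_CHIPS)
low_mw, high_mw = (total_power_mw(racks, kw) for kw in RACK_POWER_KW)

# Rack draw divided evenly across chips; this overstates per-chip power,
# since a rack also powers CPUs, networking and other components.
per_chip_kw = tuple(kw / CHIPS_PER_RACK for kw in RACK_POWER_KW)

print(f"Racks needed: {racks}")
print(f"Total draw: {low_mw:.1f} to {high_mw:.1f} MW")
print(f"Rack draw per chip: {per_chip_kw[0]:.2f} to {per_chip_kw[1]:.2f} kW")
```

Under these assumptions, a 50,000-chip deployment works out to roughly 700 racks and on the order of 80 to 90 MW of rack power, before counting any cooling overhead.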
It is unclear how these order reductions will affect Nvidia’s final sales. Despite the reported problems, there may be other buyers for the GB200 racks.
Nvidia CEO Jensen Huang has denied media reports that a flagship liquid-cooled server containing 72 of the new Nvidia chips overheated during initial testing. Huang said in November that the company was on track to exceed its earlier target of several billion dollars in revenue from Blackwell chips in its fourth quarter. Nvidia, Amazon, Microsoft, Google and Meta have all declined to comment.