AI is about to be injected into scientific computing

Revolutionizing Scientific Computing: The AI-Driven Transformation of HPC

As artificial intelligence (AI) becomes increasingly embedded in computational workloads, the landscape of scientific computing is poised for a profound transformation. Industry leaders anticipate that within the next year, AI will be deeply integrated into high-performance computing (HPC) tasks and scientific research applications, accelerating discovery and innovation at an unprecedented pace.

AI’s Expanding Role in Scientific Research

Ian Buck, Vice President of Hyperscale and HPC at Nvidia, highlights the rapid adoption of AI in scientific domains. He notes that while AI is already demonstrating remarkable potential to expedite scientific breakthroughs, the true revolution lies ahead as computing architectures evolve to better support these workloads.

“We are currently witnessing pioneering workloads that showcase how AI can dramatically enhance the speed and efficiency of scientific discovery,” Buck explains. He points to the surge of GPU-accelerated supercomputers on the Top500 list as evidence of this shift, with over 80% of the most powerful machines now leveraging GPU technology, a trend that took roughly five years to reach this tipping point.

Clarifying AI’s Role: Complement, Not Replacement

Contrary to some misconceptions, AI is not poised to supplant traditional simulation methods in scientific computing. Buck emphasizes that AI operates fundamentally as a statistical tool, employing machine learning to analyze data and predict outcomes based on probabilities rather than exact numerical precision.

“The question shouldn’t be whether AI will replace simulation; it won’t. Instead, AI serves as a powerful instrument among many, designed to augment scientific inquiry,” he states. For example, when exploring the crystalline structures of novel alloys intended to improve jet engine performance, exhaustive simulation of every molecular configuration would be impractical, potentially taking thousands of years. AI helps by predicting promising candidates, enabling researchers to focus computational resources more effectively.
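The screening pattern Buck describes can be sketched in a few lines: run the expensive simulation on a small training sample, fit a cheap predictive model to those results, rank the full candidate pool with the model, and spend the remaining simulation budget only on the top-ranked entries. Below is a minimal pure-Python sketch; the objective function, the nearest-neighbour "model", and all sizes are invented stand-ins, not any real materials-science workflow:

```python
def expensive_sim(x):
    """Stand-in for a costly physics simulation (hypothetical
    objective with its best property at x = 0.7)."""
    return -(x - 0.7) ** 2

# Large candidate pool; simulating every member is assumed too costly.
candidates = [i / 999 for i in range(1000)]

# 1. Run the expensive simulation on a small, evenly spaced sample.
train = candidates[::50]
labels = {x: expensive_sim(x) for x in train}

# 2. Cheap surrogate: predict a candidate's value from its nearest
#    already-simulated neighbour (a deliberately simple ML stand-in).
def surrogate(x):
    nearest = min(labels, key=lambda t: abs(t - x))
    return labels[nearest]

# 3. Rank the full pool with the surrogate, then spend the remaining
#    simulation budget only on the top-ranked slice.
shortlist = sorted(candidates, key=surrogate, reverse=True)[:50]
best = max(shortlist, key=expensive_sim)
print(round(best, 3))  # a value close to the true optimum at 0.7
```

Only 70 of the 1,000 candidates are ever simulated here, yet the search lands next to the optimum; that budget-focusing effect is the point of the pattern.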

Bridging AI and HPC: Nvidia’s Software Ecosystem

To facilitate the convergence of AI and HPC, Nvidia has developed a suite of specialized software frameworks tailored to diverse scientific fields. These include Holoscan for advanced sensor data processing, BioNeMo to accelerate drug discovery pipelines, and Alchemi for computational chemistry applications.

Recently, Nvidia introduced Apollo, a collection of open-source models designed to boost industrial and computational engineering workflows. Apollo integrates seamlessly with leading industrial design platforms from Cadence, Synopsys, and Siemens, enhancing productivity and innovation.

Looking ahead, Nvidia is also pioneering the integration of quantum computing with classical HPC systems through its NVQLink technology. This interface connects quantum processing units (QPUs) to Nvidia-powered machines, enabling hybrid quantum-classical workloads on the CUDA Quantum platform, which is expected to unlock new frontiers in scientific computation.

Precision Computing: The Enduring Importance of FP64

Despite the rise of AI models favoring low-precision data types such as FP8 and FP4, double-precision floating-point (FP64) computation remains indispensable for many scientific simulations. Nvidia continues to support FP64 capabilities in its accelerators, recognizing its critical role in delivering the accuracy required for academic and industrial research.

While the recent Nvidia Blackwell GPU series saw a reduction in FP64 matrix performance (from 67 teraFLOPS to roughly 40 to 45 teraFLOPS, depending on the model), FP64 vector performance, which better reflects real-world HPC workloads like the High Performance Conjugate Gradient (HPCG) benchmark, actually improved from 34 to 45 teraFLOPS.
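The precision trade-off behind this debate is easy to demonstrate numerically. The sketch below emulates single-precision (FP32) round-off in pure Python via `struct` and compares it with native FP64 on a long running sum, the kind of accumulation a simulation performs constantly; the step size and iteration count are illustrative only:

```python
import struct

def f32(x):
    """Round a Python double (FP64) to the nearest IEEE-754 single (FP32)."""
    return struct.unpack("f", struct.pack("f", x))[0]

N, STEP = 1_000_000, 0.1   # true sum is exactly 100000
step32 = f32(STEP)

acc64, acc32 = 0.0, 0.0
for _ in range(N):
    acc64 += STEP                 # every operation rounded to FP64
    acc32 = f32(acc32 + step32)   # every operation rounded to FP32

print(f"FP64 sum: {acc64!r}  error: {abs(acc64 - 100000):.2e}")
print(f"FP32 sum: {acc32!r}  error: {abs(acc32 - 100000):.2e}")
```

The FP32 accumulator drifts off by hundreds while the FP64 result stays within a tiny fraction of a unit; iterated over billions of timesteps, that gap is why double precision remains non-negotiable for many simulations even as AI models thrive at FP8 and below.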

Some concerns arose when Nvidia’s Blackwell Ultra variant reallocated die space to enhance AI-focused low-precision performance, leading to speculation about deprioritizing HPC. Buck clarifies that Blackwell Ultra is a specialized AI inference design, and Nvidia’s upcoming Rubin generation will offer a balanced portfolio: some devices optimized for mixed-precision workloads, others tailored specifically for AI acceleration.

Notably, the Rubin CPX chip is engineered to offload large language model (LLM) prefill operations from GPUs, accelerating token processing and improving overall efficiency.
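For context on why prefill is worth offloading to a dedicated chip: prefill processes the entire prompt in one large, compute-bound matrix-matrix pass, while the subsequent decode phase emits one token at a time with much lighter matrix-vector work. A back-of-the-envelope comparison, using invented layer sizes rather than any real model's dimensions:

```python
# Toy illustration of why LLM "prefill" and "decode" stress hardware
# differently (hypothetical sizes; no real model involved).
prompt_tokens, hidden = 4096, 8192

# Prefill: all prompt tokens pass through a weight matrix at once,
# one matrix-matrix multiply of ~2 * tokens * hidden^2 multiply-adds.
prefill_flops = 2 * prompt_tokens * hidden * hidden

# Decode: one new token per step, a matrix-vector multiply of
# ~2 * hidden^2 multiply-adds per generated token.
decode_flops_per_token = 2 * hidden * hidden

# One prefill costs as much compute as this many decode steps:
print(prefill_flops // decode_flops_per_token)  # prints 4096
```

The ratio is simply the prompt length: a single prefill pass over a 4,096-token prompt costs as much raw compute as generating 4,096 tokens, which is why routing that burst to specialized hardware can free the main GPUs for decode.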

Supercomputing Milestones and Future Prospects

Nvidia’s influence in the supercomputing arena continues to grow, having secured over 80 contracts in the past year alone, collectively delivering 4,500 exaFLOPS of AI compute power. Among these is the Texas Advanced Computing Center’s Horizon supercomputer, slated for deployment in 2026.

Horizon will feature 4,000 Blackwell GPUs alongside 9,000 Nvidia Vera CPUs, capable of achieving 300 petaFLOPS in FP64 precision and an astounding 80 exaFLOPS in AI compute (FP4). This system will empower researchers to simulate molecular dynamics for virus research, investigate cosmic phenomena such as star and galaxy formation, and analyze seismic waves to enhance earthquake early warning systems.

In partnership with Oracle, Nvidia is also contributing to the construction of seven next-generation supercomputers for the U.S. Department of Energy, including the largest ever built, underscoring the critical role of AI and HPC convergence in tackling the world’s most complex scientific challenges.
