
The $100B Memory War: Inside the Battle for AI’s Future


Feature. Generative AI has exposed a brutal truth: raw computing power is useless if you can't feed it data fast enough. In AI datacenters packed with thousands of GPUs, memory bandwidth is the real bottleneck.

After decades of engineers obsessing over FLOPS, the industry now faces a new iron rule: if data can't be moved through your trillion-dollar AI system fast enough, it becomes a very expensive paperweight.


High Bandwidth Memory 4 (HBM4) is a 3D-stacked memory technology that promises unprecedented bandwidth per chip, and it could determine which companies dominate the AI landscape and which disappear. This isn't a simple incremental upgrade. It's the difference between training a breakthrough AI model in weeks rather than months, and between profitable inference and burning cash on every query.

JEDEC released the HBM4 specification for high-performance AI earlier this year. The new standard defines a higher speed per pin and a wider interface than its HBM3 counterpart: it targets 8 Gbps per pin across a 2,048-bit interface, for 2 TB/s of bandwidth per memory stack, roughly twice as fast as current HBM3 chips. That is a major development for AI accelerators.
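As a quick sanity check, the 2 TB/s headline figure follows directly from the per-pin speed and interface width quoted above. The short Python sketch below is purely illustrative arithmetic, not anything taken from the spec itself:

    # HBM4 per-stack bandwidth, using the baseline figures quoted above
    pin_speed_gbps = 8            # 8 Gb/s per pin (HBM4 baseline)
    interface_width_bits = 2048   # 2,048-bit interface per stack

    total_gbits_per_s = pin_speed_gbps * interface_width_bits   # 16,384 Gb/s
    total_gbytes_per_s = total_gbits_per_s / 8                  # 2,048 GB/s

    print(f"{total_gbytes_per_s:.0f} GB/s, about {total_gbytes_per_s / 1000:.1f} TB/s per stack")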

A second improvement is capacity. HBM4 allows taller stacks of up to 16 bonded memory dies, with per-die densities as high as 24 Gb or even 32 Gb, for a maximum of 64 GB per stack. A single HBM4 stack can hold as much data as the entire memory of a high-end GPU today. Despite the speed boost, HBM4 was also designed for power efficiency, allowing lower I/O and core voltages to improve energy per bit. These advances are aimed squarely at generative AI: GPUs constantly shuffle terabytes of data to train large language models or run giant recommendation systems, and faster, wider memory eases that bottleneck so each GPU can churn through data more quickly.
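The 64 GB ceiling is simply the stack height multiplied by the top per-die density. A minimal sketch of that arithmetic, using the figures above:

    # HBM4 maximum capacity per stack, from the die count and density quoted above
    dies_per_stack = 16      # up to 16-high stacks
    die_density_gbit = 32    # up to 32 Gb per DRAM die

    capacity_gbit = dies_per_stack * die_density_gbit   # 512 Gb
    capacity_gbyte = capacity_gbit / 8                   # 64 GB

    print(f"{capacity_gbyte:.0f} GB per stack")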

But developing and manufacturing HBM4 is a bigger challenge still. Only three memory vendors (SK hynix, Micron, and Samsung) have the DRAM and 3D-stacking expertise required to deliver HBM4 at scale. Whether they reach mass production will shape the AI hardware roadmaps of companies like Nvidia, AMD, and Broadcom for their upcoming GPUs and AI accelerators.

SK hynix is positioned as the leader in HBM4, with a history of HBM firsts: it was the first to supply HBM2 for AMD GPUs, and the first to ship HBM2E and HBM3 to major customers. Counterpoint Research puts SK hynix's Q2 2025 HBM market share at 62 percent, a significant lead over its competitors, and that dominance owes much to its close alliance with Nvidia.

SK hynix began sampling HBM4 even before the official JEDEC specification was published, shipping the first 12-layer HBM4 samples in March 2025 to demonstrate that its stacking technology was ready. The company has since announced that HBM4 development is complete and the design has been prepared for high-volume production, according to Joohwan Cho, head of HBM Development at SK hynix.


In September 2025, SK hynix confirmed that its HBM4 meets all specifications and runs at 10 GT/s, 25 percent faster than the 8 GT/s baseline. That 10 GT/s class lands right on the mark for Nvidia's requirements for its next-generation GPUs, and SK hynix has suggested its design may exceed the JEDEC specification, presumably to give Nvidia the performance headroom it wants.

SK hynix uses its proven 1b process (a fifth-generation 10nm-class node) for the HBM4 DRAM dies. The slightly older node is a sensible choice for stacking 12 dies together, offering lower defect density and better yields. SK hynix has not disclosed the node for the base logic die that sits beneath the DRAM layers; speculation points to either a TSMC 5 nm-class or 12 nm-class process.

The company's philosophy of "make it work reliably first, then push performance" is in line with its conservative, steady HBM leadership. SK hynix says it will ramp up HBM4 production in late 2025 if customers demand it, and while it hasn't announced a specific ship date, all signs suggest volume shipments could start in early 2026 once final qualifications are complete.

Nvidia's flagship GPUs will be the first destination. Industry reports indicate that SK hynix HBM4 is the first to be integrated into Nvidia's Rubin GPU platform, and given how close the two companies are, SK hynix is likely to supply most of the initial memory for those GPUs in 2026. That puts it on the front foot to be the first to ship HBM4 in large quantities.

SK hynix's market leadership is also translating into substantial financial gains this year: in Q2 2025 the company reported that 77 percent of its sales came from HBM and AI memory. But despite its current dominance, the HBM4 race isn't over yet. Its rivals are chasing hard.

Volume challenger

Micron is a latecomer to the HBM market, yet over the past year it has overtaken Samsung, with a market share of 21 percent against Samsung's 17 percent. That is a remarkable turnaround for a company that had no HBM presence just a few short years ago, and the surge in demand for generative AI has been the catalyst.


HBM3E is the foundation of this success. Micron has secured HBM3E supply agreements with six customers for GPUs and accelerators, and it became a supplier for Nvidia's AI GPUs because Nvidia historically sources memory from multiple vendors to ensure redundancy; both Micron and SK hynix got a piece of that pie.

Micron's HBM business is growing fast. In the quarter ending September 2025, the company reported HBM revenue of almost $2 billion, taking HBM from a niche product to a double-digit percentage of total revenue in a short period. Micron has even said its HBM production for 2025 and 2026 is fully booked.

Riding that momentum, Micron started shipping HBM4 samples to customers in June 2025, delivering 36 GB, 12-high stacks to key customers, one of them Nvidia. The silicon has improved in the months since: in Q4 2025 Micron announced that its HBM4 samples were running at speeds exceeding 11 Gbps per pin, delivering more than 2.8 TB/s per stack.
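That 2.8 TB/s figure follows from the quoted per-pin rate, assuming the standard 2,048-bit HBM4 interface width (an assumption on our part; Micron's statement only gives the pin speed). A quick sketch:

    # Per-stack bandwidth implied by Micron's quoted pin speed (illustrative)
    pin_speed_gbps = 11           # >11 Gb/s per pin, per Micron
    interface_width_bits = 2048   # assumed standard HBM4 interface width

    bandwidth_tb_per_s = pin_speed_gbps * interface_width_bits / 8 / 1000
    print(f"about {bandwidth_tb_per_s:.1f} TB/s per stack")   # ~2.8 TB/s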

Micron's HBM4 is likely to enter mass production in calendar 2026, and the company has already secured multibillion-dollar HBM supply agreements for that year. Major buyers, from cloud giants to GPU makers, are counting on Micron as part of their 2026 supply chains, and Micron can fill the gap if SK hynix cannot meet demand for Nvidia's memory, or if Nvidia wants second-source flexibility.

Pushing limits


Samsung is in an awkward position in the HBM4 race: despite its manufacturing prowess, it fell behind in the early HBM generations and now has to play catch-up.

Samsung's problems became apparent with HBM3E. While SK hynix and Micron ramped up 8-high and 12-high HBM3E, Samsung struggled with its 12-high stacks, taking 18 months and multiple attempts to meet Nvidia's performance and quality criteria. Its fifth-generation, 12-high HBM3E finally passed all tests in Q3 2025.

Up until now, Samsung's HBM was used only in AMD's MI300-series accelerators. Nvidia has since certified the 12-high HBM3E and agreed to purchase between 30,000 and 50,000 units for use in liquid-cooled AI servers, and Samsung's HBM3E was also set to ship for AMD accelerators by mid-2025.

One key reason for the lag is that Samsung tried to use its cutting-edge 1c process (a sixth-generation 10nm-class node) for the 12-high HBM3E, but ran into yield issues. Pilot runs on 1c were yielding only around 65 percent of chips as of July 2025, a serious problem for mass production, forcing Samsung to recalibrate: revising its DRAM design, improving the base dies, and tightening thermal control.

Samsung plans to begin mass production of HBM4 in the first half of 2026, and in Q3 2025 it began sending large quantities of HBM4 samples to Nvidia for early qualification. The company has another strategic ace up its sleeve: a deepening relationship with AMD (and OpenAI). In October 2025, news broke that AMD had signed a deal to supply Instinct MI450 GPUs to OpenAI, and Samsung is reported to be the main supplier of the HBM4 used in AMD's MI450 accelerators.

The race to supply HBM4 will not be a zero-sum competition. All three vendors will be pushed to their limits to deliver high-performance memory for generative AI, and the real winners will be those that can overcome the technical challenges and deliver at scale.

For the wider market, the ideal outcome is that all three succeed: supply constraints would ease and more AI capability would land in the hands of researchers and businesses. Either way, 2026 will be the year that decides this memory race, and any company whose AI product plans rest on a supplier that fails to show up in volume will have to rethink them. ®
