Here’s what it’ll take for Nvidia and other US chipmakers to flog AI chips in China

Uncle Sam has made selling AI silicon in China progressively harder for US chip designers over the past few years. But it’s not impossible.

At first, the rules capped the speed of the interconnects used to lash multiple GPUs together. By 2023, limits on processor performance itself had been added.

As the rules tightened, Nvidia and AMD quietly unveiled sanctions-compliant versions of their flagship products.

Many of these chips were effectively banned from China last April when Uncle Sam again lowered the boom, this time restricting memory and I/O bandwidth. But Nvidia isn’t giving up on the Middle Kingdom.

The chipmaker’s newest China-spec GPU, currently in limbo thanks to the US Commerce Department’s bar, will reportedly be based on its RTX Pro 6000 series server chips.

Announced at GTC in March, the parts boast up to 4 petaFLOPS of sparse performance at 4-bit floating-point precision, along with 96GB of GDDR7 memory delivering 1.6TB/s of bandwidth.

That performance will have to come down considerably for the China-spec version, reportedly called the RTX Pro 6000D. The chip’s specifications remain vague, but because of the way US export controls are written, any sanctions-compliant part will have to follow a similar recipe.

Nvidia isn’t sugarcoating the challenge. “We are still evaluating our limited options. Until we settle on a new product design and receive approval from the US government, we are effectively foreclosed from China’s $50 billion datacenter market,” an Nvidia spokesperson told The Register.

Building a sanctions-compliant AI accelerator in 2025

Want to build a sanctions-compliant accelerator in 2025? First, you’ll need to avoid high-bandwidth memory, and not just HBM: too much GDDR7 or LPDDR5x could push your chip over the line as well.

Why does Uncle Sam suddenly care how fast your memory is? Because memory bandwidth is often the bottleneck in AI inference – the act of actually using a model.
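
To see why, a back-of-the-envelope estimate helps: in the memory-bound decode phase, generating each token means streaming the model’s weights out of memory, so bandwidth sets a hard ceiling on tokens per second. Here’s a minimal sketch in Python; the model size, quantization, and efficiency figures are illustrative assumptions, not measured numbers:

```python
# Rough ceiling on single-stream decode speed for a dense model:
# generating each token requires reading every weight from memory once,
# so tokens/sec <= usable memory bandwidth / model size in bytes.

def max_tokens_per_sec(params_billion: float, bits_per_weight: int,
                       mem_bw_tbs: float, efficiency: float = 0.7) -> float:
    """Crude upper bound on decode throughput for a memory-bound model."""
    model_bytes = params_billion * 1e9 * bits_per_weight / 8
    usable_bw = mem_bw_tbs * 1e12 * efficiency  # real chips never hit peak
    return usable_bw / model_bytes

# A hypothetical 70B-parameter dense model quantized to 4 bits:
print(f"1.6TB/s card: {max_tokens_per_sec(70, 4, 1.6):.0f} tok/s")  # ~32
print(f"1.4TB/s card: {max_tokens_per_sec(70, 4, 1.4):.0f} tok/s")  # ~28
```

Cut the bandwidth and the ceiling drops in direct proportion, which is why regulators now target memory speed rather than just FLOPS.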

It’s this memory requirement that has halted all those billions of dollars in H20, MI308, Gaudi, and other shipments to China. In a recent regulatory filing, Nvidia explained that US export controls now apply to those products.

Here’s where things start to get a little fuzzy. The Commerce Department’s Bureau of Industry and Security hasn’t issued specific guidance on how much I/O or memory bandwidth is too much. However, an email Intel sent to its Chinese customers in April, reviewed by the Financial Times, reportedly set the limits at 1.4TB/s of DRAM bandwidth, 1.1TB/s of I/O bandwidth, or 1.7TB/s combined.

Those limits rule out the sale of any accelerator built with HBM, a memory technology already in the crosshairs of US export czars. That’s likely why Nvidia CEO Jensen Huang was quoted as saying the company’s line of Hopper-based chips for China is at an end: they were designed exclusively around HBM.

Nvidia’s Blackwell-based RTX Pro graphics cards, by contrast, don’t use HBM, favoring consumer-oriented GDDR7 instead. The server edition of the chip tops out at 96GB of memory and 1.6TB/s of bandwidth.

That means Nvidia will need to trim 200GB/s of bandwidth to meet the reported limit. But 1.4TB/s is still an impressive amount of bandwidth if you plan to deploy models like Alibaba’s Qwen3-235B or DeepSeek’s V3 and R1. If you’re curious why, take a look at our deep dive on MoE architectures.

I/O, meanwhile, isn’t an issue: the RTX Pro 6000’s 16 lanes of PCIe 5.0 deliver up to 128GB/s of bidirectional bandwidth, comfortably under the reported 1.1TB/s cap.
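
Taken together, the reported thresholds amount to a simple checklist. Here’s a minimal sketch of that check, using the figures from the Intel email as relayed by the FT; remember, these are reported numbers, not official BIS guidance:

```python
# Reported bandwidth caps from the Intel email reviewed by the FT
# (not official BIS guidance): DRAM <= 1.4TB/s, I/O <= 1.1TB/s,
# combined <= 1.7TB/s.

DRAM_CAP_TBS = 1.4
IO_CAP_TBS = 1.1
COMBINED_CAP_TBS = 1.7

def bandwidth_compliant(dram_tbs: float, io_tbs: float) -> bool:
    """True if a part stays under all three reported caps."""
    return (dram_tbs <= DRAM_CAP_TBS
            and io_tbs <= IO_CAP_TBS
            and dram_tbs + io_tbs <= COMBINED_CAP_TBS)

# Stock RTX Pro 6000: 1.6TB/s of GDDR7, 0.128TB/s of PCIe 5.0 x16
print(bandwidth_compliant(1.6, 0.128))  # False - memory is over the cap
# A hypothetical 6000D with GDDR7 trimmed to 1.4TB/s
print(bandwidth_compliant(1.4, 0.128))  # True
```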

A lot of sand wasted

With the memory bit taken care of, next you’ll want a lot of silicon. You won’t get to use most of it, but under the export rules, the larger the die, the more performance you’re allowed to ship.

We know the RTX Pro 6000’s die area is 750mm2. Armed with that figure and the chip’s 4-bit performance, we can calculate how much performance Nvidia needs to shed to sell it in China under the current requirements, which haven’t changed since 2023.

The Center for Strategic and International Studies has a nice graphic illustrating the trade-offs. The vertical axis measures TPP, or total processing performance; the horizontal axis measures performance per square millimeter of silicon, or performance density. The ideal chip would have a high TPP, but thanks to US export restrictions, the sweet spot is somewhere in the middle.

For modern AI accelerators, the sweet spot falls below 2,400 TPP (total processing performance) and 3.2 PD. Image credit: CSIS.

Specifically, we’re aiming for a TPP under 2,400 and a performance density (PD) under 3.2.

To determine TPP and PD, you only need three variables:

  1. The chip’s advertised teraOPS (or teraFLOPS)
  2. The “bit width” (precision) of those OPS or FLOPS
  3. The chip’s total die area in mm2

To find TPP, multiply the teraOPS figure by its bit width (aka precision). To find PD, divide TPP by the chip’s die area.

For the RTX Pro 6000, the math looks like this:

  • 4,000 teraOPS x 4 bit width = 16,000 TPP
  • 16,000 TPP / 750mm2 die area = 21.3 PD

Obviously, that’s not going to fly with US export enforcers, so we’ve got to get that figure under the limit. The 6000D’s maximum theoretical performance can be worked out by running the math in reverse (sanity-checked in the short script after these bullets):

  • 3.1 PD x 750mm2 die area = 2,325 TPP
  • 2,325 TPP / 4-bit width = 581 teraOPS
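
As promised, here’s that arithmetic as a quick scripted sanity check, using the die area and performance figures above (the 3.1 PD target leaves a little headroom under the 3.2 cap):

```python
# TPP = teraOPS x bit width; PD = TPP / die area in mm^2.
# Export caps, per CSIS: TPP < 2,400 and PD < 3.2.

def tpp(tera_ops: float, bit_width: int) -> float:
    """Total processing performance."""
    return tera_ops * bit_width

def pd(tpp_value: float, die_area_mm2: float) -> float:
    """Performance density."""
    return tpp_value / die_area_mm2

DIE_AREA_MM2 = 750  # RTX Pro 6000

# Stock RTX Pro 6000: 4,000 teraOPS of sparse 4-bit performance
stock_tpp = tpp(4_000, 4)
print(stock_tpp, round(pd(stock_tpp, DIE_AREA_MM2), 1))  # 16000 21.3

# Run the math in reverse for a compliant part at 3.1 PD
max_tpp = 3.1 * DIE_AREA_MM2  # 2,325 TPP
max_tera_ops = max_tpp / 4    # 581.25 teraOPS at 4-bit precision
print(f"required cut: {1 - max_tera_ops / 4_000:.0%}")  # 85%
```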

In other words, to sell the RTX Pro 6000 in China, Nvidia would have to shave roughly 85 percent off its performance.

No guarantees

Of course, you could still end up writing off billions in inventory and sales if the rules change again.

Following the announcement of the latest round of AI performance caps, Nvidia warned it would take a $5.5 billion charge [https://www.theregister.com/2025/04/16/trump_responds_to_nvidias_us/] relating to H20 inventory, purchase commitments, and related reserves in the first quarter of its fiscal 2026.

The total impact could end up much higher, if AMD’s experience is any guide: the company booked an $800 million charge [https://www.theregister.com/2025/04/16/amd_instinct_mi308_china/] for its MI308 accelerators, and expects to lose another $1.5 billion in revenue in 2025 because of the updated trade restrictions.

Nvidia CEO Jensen Huang used the Computex conference last week to criticize Uncle Sam’s obsession with hoarding its tech, calling the export bans “precisely incorrect” and “a failure.”

Huang also argued that denying China advanced tech would ultimately harm humanity.

His argument rests in part on the fact that roughly half of the world’s AI researchers reside in China; cut off their access to Nvidia’s hardware, he argued, and you effectively cut the rest of the globe off from their innovations.

Diminishing returns

Even with Uncle Sam’s licensing requirements in place, it’s only a matter of time before Chinese-made accelerators surpass the hobbled parts US vendors are allowed to sell.

As our sister site The Next Platform has discussed, Huawei’s Ascend series of AI accelerators already offers better raw performance than Nvidia’s China-spec parts. And now that Nvidia is barred from selling its H20 accelerators in the Middle Kingdom, any advantage from the H20’s higher-bandwidth memory is moot.

US designers may be able to compete on volume and software compatibility for now, but China’s AI supply chain will eventually catch up. ®
