Alibaba to stop relying on Nvidia's AI inference

Alibaba is said to have developed an AI accelerator in response to increasing pressure from Beijing, which wants the nation to reduce its reliance on Nvidia graphics cards. The Wall Street Journal reported on Friday that Alibaba’s latest chip was aimed at AI inference. This refers to serving rather than training models. Alibaba’s T-Heat division is working on AI silicon since some time. In 2019, it launched Hanguang 800. The part, unlike the modern chips from Nvidia or AMD, was primarily designed for conventional machine learning models such as ResNet, and not the large language models and diffusion models used to power AI chatbots and images generators today.

It’s reported that the new chip will be able handle a wider range of workloads. Alibaba, with its Qwen3 model family launched in April, has become a leading developer of open models. It’s not surprising that its initial focus was on inference. It is easier to train models than to serve them, so it’s a good place to begin the transition to its own hardware. Alibaba will likely continue to use Nvidia accelerators as a model training tool for the foreseeable.

The Journal reports that, unlike Huawei’s Ascend NPUs, Alibaba’s chip will be compatible Nvidia’s software platform. This allows engineers to repurpose their existing code. This might sound like CUDA, Nvidia’s low-level GPU programming language. However, this is unlikely and not necessary for inference.

It is more likely that Alibaba is aiming for higher-level abstractions like TensorFlow or PyTorch, which provide a programming interface that is hardware-agnostic. We say largely because there are still many PyTorch libraries that are built for Nvidia hardware. However, projects like Triton addressed most of these edge cases.

The chip will have to be made domestically because of US export controls for semiconductor tech, which prevents many Chinese companies from doing a deal with TSMC and Samsung Electronics.

Although the report does not specify which company will be responsible for fabing the chip, if we were to guess, we would say it is China’s Semiconductor Manufacturers International Co. SMIC is also the fab that produces Huawei’s Ascend NPU family.

Manufacturing challenges aren’t China’s only challenge. AI accelerators require large quantities of fast memories. This is usually high bandwidth memory (HBM), but it’s also restricted due to the US-China trade war. HBM2e or newer cannot be sold in China without being attached to a processor.

This means Alibaba’s chip will rely on slower GDDR and LPDDR memory or existing stockpiles HBM3 or HBM2e or older HBM2, that isn’t limited, until Chinese vendors are ready fill the void.

The news of the homegrown silicon comes at a time when the Chinese government is pressuring tech titans to not use Nvidia H20 accelerators, while also stoking concerns about backdoors and remotely controlled kill switches. Nvidia has denied that such features exist. The company was recently cleared to resume shipping of H20s into China.

We reported this week that while Nvidia was given the green light to resume the shipment of the chip, it does not expect to see revenues in the region during the current quarter as the company waits for Uncle Sam’s to navigate the maze-like red tape necessary to implement a 15% export tax on AI processors bound for China.

Nvidia’s imminent return to China hasn’t prevented many of China’s AI flag-bearers from looking for alternative solutions. DeepSeek’s market-shaking models were retuned to run on the latest generation of domestic silicon earlier this month.

DeepSeek did not identify the chip supplier, but the company reportedly failed to transfer models to Huawei’s Ascend Accelerators. Nvidia, the AI arms dealer, laments the billions of dollars lost in the US-China trade conflict

DeepSeek V3.1’s release indicates that powerful new Chinese chips are coming soon

DeepSeek’s new V3.1 releases points to the arrival of new Chinese chips

Alibaba & Huawei aren’t alone in their efforts to end China’s dependence on Western silicon. Last month, EE Times China reported that Tencent-backed startup Enflame is developing a new AI processor called the L600. The chip would feature 144GB on-chip memory with 3.6Tb/s bandwidth. MetaX has revealed its C600, which will feature 144GB HBM3e. It appears that the existing stockpiles HBM3e may limit chip production.

Lastly, Cambricon is another AI chip hopeful, seen by some as China’s Nvidia. They are also working to develop a home-grown accelerator called the Siyuan 690, which is widely expected

to outperform Nvidia’s now three-year old H100 accelerators. The Register contacted Alibaba for a comment. We’ll let you all know if there is any response. (r)

Alibaba to stop relying on Nvidia’s AI inference

The AI lab revolving door spins ever faster

Flutterwave goes deeper into stablecoins with Turnkey-powered wallets for merchants

Sophos Launches Browser-Based Security Product Targeting Hybrid Work & AI Risks

Razer’s Project Ava: AI now goes in a cannister on your...

Recomended

The AI lab revolving door spins ever faster

Flutterwave goes deeper into stablecoins with Turnkey-powered wallets for merchants

Sophos Launches Browser-Based Security Product Targeting Hybrid Work & AI Risks

Razer’s Project Ava: AI now goes in a cannister on your desk

Tech Careers in 2026 and Beyond: Inside the Jobs, Skills, and Roles Defining Africa’s Digital Future

OpenAI invests in brain-interface biz co-founded by CEO Sam Altman