Chained bugs in Nvidia Triton Inference Server can compromise the entire system

August 5, 2025

Security researchers have disclosed a series of high-severity vulnerabilities in Nvidia’s Triton Inference Server that, when chained together, could lead to remote code execution (RCE).

Wiz Research said that successful exploitation of the three vulnerabilities could result in AI model theft, data breaches, manipulation of AI model responses, or attackers moving on to other areas of the system.

Nvidia has now patched the bugs affecting Triton Inference Server, an open-source platform used to run AI models and serve them to user-facing apps. Nvidia designed Triton to run models from any major AI framework, which it does through separate backends dedicated to each framework. Triton’s Python backend is also used by frameworks other than Python, making it the most versatile of the backends the server supports. That broader reliance on the Python backend could significantly increase the number of organizations affected.

CVE-2025-23320 (7.5) affects the Python backend. It is triggered when a very large request exceeds the shared-memory limit, producing an error message that reveals the unique key (name) of the backend’s internal IPC shared-memory region.
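The general defence against this kind of leak is to keep internal identifiers out of client-facing errors. The following is a minimal sketch of that idea, not Triton's actual code or Nvidia's fix: the `internal_shm_region_` naming pattern, the logger name, and the `client_safe_error` helper are all hypothetical and chosen only for illustration.

```python
import logging
import re
import uuid

logger = logging.getLogger("inference_server")  # hypothetical logger name

# Hypothetical pattern for internal shared-memory region names.
# The real naming scheme is not given in this article.
INTERNAL_SHM_NAME = re.compile(r"internal_shm_region_[0-9a-fA-F-]+")


def client_safe_error(internal_message: str) -> str:
    """Log the full error server-side and return a redacted copy for the client."""
    incident_id = uuid.uuid4().hex[:8]
    # The unredacted message stays in the server log for operators.
    logger.error("incident %s: %s", incident_id, internal_message)
    # The client only sees the message with internal region names removed.
    redacted = INTERNAL_SHM_NAME.sub("[redacted]", internal_message)
    return f"{redacted} (incident {incident_id})"
```

Operators can still correlate a user report with the full server-side log entry through the incident ID, while the region name itself never leaves the server.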

By combining the unique memory region name with the public shared-memory API, attackers could take control of the Triton Inference Server.

An attacker can then abuse the API’s poor validation to exploit two further bugs: an out-of-bounds write (CVE-2025-23319, 8.1) and an out-of-bounds read (CVE-2025-23334, 5.9).
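Out-of-bounds reads and writes of this kind are usually prevented by checking every offset and length against the size of the region before touching it. Here is a small illustrative sketch of such a check. The function names and the use of a `bytearray` to stand in for a shared-memory region are assumptions made for the example and do not reflect Triton's implementation.

```python
class OutOfBoundsError(ValueError):
    """Raised when a requested range falls outside the region."""


def check_bounds(region_size: int, offset: int, length: int) -> None:
    """Reject any (offset, length) pair that does not fit inside the region."""
    if offset < 0 or length < 0:
        raise OutOfBoundsError("offset and length must be non-negative")
    if offset > region_size or length > region_size - offset:
        raise OutOfBoundsError(
            f"range [{offset}, {offset + length}) exceeds region size {region_size}"
        )


def read_region(region: bytearray, offset: int, length: int) -> bytes:
    """Bounds-checked read from a region (a bytearray stands in for shared memory)."""
    check_bounds(len(region), offset, length)
    return bytes(region[offset:offset + length])


def write_region(region: bytearray, offset: int, data: bytes) -> None:
    """Bounds-checked write into a region, never growing it."""
    check_bounds(len(region), offset, len(data))
    region[offset:offset + len(data)] = data
```

In a language such as C or C++ the same comparison matters even more, because an unchecked offset reads or corrupts adjacent memory instead of raising an error.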

Triton accepts the attacker’s registration request because the API does not check whether the supplied key (the unique shared-memory name) corresponds to a legitimate user-owned region or a private internal one. That gives the attacker read and write access to the region, and an attacker who can manipulate the backend’s shared memory can ultimately take full control of the server.
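The missing check described above amounts to refusing any user registration whose key names a server-internal region. A minimal sketch of that validation might look like the following, assuming a hypothetical `SharedMemoryRegistry` class. None of these names come from Triton's codebase, and the actual patch may work differently.

```python
class SharedMemoryRegistry:
    """Toy registry that keeps internal and user-owned shared-memory keys apart."""

    def __init__(self) -> None:
        self._internal_keys: set[str] = set()
        self._user_regions: dict[str, str] = {}

    def reserve_internal(self, key: str) -> None:
        """Record a key the server itself created for private IPC."""
        self._internal_keys.add(key)

    def register_user_region(self, name: str, key: str) -> None:
        """Register a client-supplied region, rejecting internal keys."""
        if key in self._internal_keys:
            # This is the check the article says was missing.
            raise PermissionError("key refers to a server-internal region")
        if name in self._user_regions:
            raise ValueError(f"region {name!r} is already registered")
        self._user_regions[name] = key

    def is_registered(self, name: str) -> bool:
        return name in self._user_regions
```

With a check like this in place, a leaked internal key is no longer enough on its own, because the registration call fails before any read or write can be issued against the private region.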

Wiz would not confirm whether the bug chain had been exploited. The team behind the findings said:

“This research demonstrates how a series of seemingly minor flaws can be chained together to create a significant exploit. A verbose message in one component [...] can be used to compromise the system.”

“We strongly recommend all Triton Inference Server users update to the latest version.”
