
Nvidia and Microsoft accelerate AI on PCs


AI PCs will get TensorRT for RTX.

Image credit: Nvidia

Nvidia and Microsoft have announced joint work to accelerate AI processing on Nvidia RTX AI PCs.

Generative artificial intelligence is transforming PC applications into new experiences, from writing assistants to intelligent agents and creative tools.

Nvidia RTX PCs power this transformation, with technology that makes it easier to experiment with generative AI and unlocks greater performance on Windows 11.

TensorRT for RTX AI PCs

The TensorRT inference engine has been reimagined for RTX AI PCs. It combines industry-leading TensorRT performance, just-in-time on-device engine building, and an 8x smaller package size for rapid AI deployment across the more than 100 million RTX AI PCs.

Announced during Microsoft Build, Windows ML natively supports TensorRT for RTX — a new inference platform that offers app developers both broad hardware compatibility and state-of-the-art performance.

At an Nvidia press briefing, Gerardo Delgado said that AI PCs begin with Nvidia RTX hardware and CUDA programming, as well as a variety of AI models. At its most basic level, he said, an AI model is a collection of mathematical operations plus a way to execute them; the combination of the operations with how they are run is what is known as a machine learning graph.
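
To make that idea concrete, here is a minimal sketch of a machine learning graph built with the ONNX helper API: two mathematical operations (a matrix multiply and a bias add) plus the metadata needed to execute them. The layer shapes and names are illustrative, not taken from Nvidia's materials.

```python
# A tiny "model" as a graph of math operations, using the ONNX helper API.
# Shapes and names are illustrative only.
import numpy as np
import onnx
from onnx import TensorProto, helper

# Two operations: a matrix multiply followed by a bias add.
matmul = helper.make_node("MatMul", inputs=["x", "w"], outputs=["xw"])
bias_add = helper.make_node("Add", inputs=["xw", "b"], outputs=["y"])

graph = helper.make_graph(
    nodes=[matmul, bias_add],  # the operations...
    name="tiny_linear_layer",
    inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 4])],
    outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 2])],
    initializer=[              # ...and the weights they run against
        helper.make_tensor("w", TensorProto.FLOAT, [4, 2],
                           np.ones((4, 2), dtype=np.float32).flatten()),
        helper.make_tensor("b", TensorProto.FLOAT, [2],
                           np.zeros(2, dtype=np.float32)),
    ],
)

model = helper.make_model(graph)
onnx.checker.check_model(model)  # the graph plus execution metadata is the model
```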

“Our GPUs will execute these operations using Tensor Cores,” he added. Tensor Cores differ from generation to generation, he explained; Nvidia has updated them periodically, and within a single GPU generation the Tensor Core configuration also varies by SKU. The key to performance is matching the right Tensor Core kernel to each mathematical operation, which TensorRT does in two steps.

First, Nvidia optimizes the AI model. It quantizes the model, reducing the precision of some of its parts, or layers. TensorRT then consumes the optimized model and prepares a plan with a pre-selection of kernels. TensorRT for RTX improves on this experience: it is designed specifically for RTX PCs and provides the same TensorRT performance, but instead of pre-generating TensorRT engines per GPU, it focuses on optimizing the model and ships a generic TensorRT engine.

Once the application is installed, TensorRT for RTX generates the correct TensorRT engine for your specific GPU in just seconds. “This greatly simplifies the developer’s workflow,” Delgado said. He noted that the results include a reduction in library size, improved performance for video production and better-quality livestreams.
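
As a rough illustration of this just-in-time, on-device engine building, the sketch below uses the standard TensorRT Python API to build an engine from an ONNX model on first run and cache it for the local GPU. The actual TensorRT for RTX SDK may differ; the file paths and caching scheme here are assumptions.

```python
# Hedged sketch of just-in-time, on-device engine building with the
# standard TensorRT Python API. TensorRT for RTX's real SDK may differ;
# the paths and caching scheme are assumptions for illustration.
import os
import tensorrt as trt

def get_engine(onnx_path: str, cache_path: str) -> bytes:
    # Reuse an engine already built for this machine's GPU, if present.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

    # Parse the pre-optimized (e.g. quantized) ONNX model.
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # allow reduced-precision kernels

    # Building here selects kernels tuned for this GPU's Tensor Cores.
    engine = builder.build_serialized_network(network, config)
    with open(cache_path, "wb") as f:
        f.write(engine)
    return bytes(engine)
```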

Nvidia’s SDKs allow app developers to easily integrate AI features into their apps and accelerate them on GeForce GPUs. This month, top software applications such as Autodesk, Bilibili, Chaos, LM Studio and Topaz will release updates to unlock RTX acceleration and AI features.

AI developers and enthusiasts can easily get started with AI using Nvidia NIM. These pre-packaged AI models run in popular apps such as Microsoft VS Code, ComfyUI and AnythingLLM. The FLUX.1 image-generation model is available as a NIM, and the popular FLUX.1 NIM has been updated to support additional RTX GPUs. Project G-Assist, the RTX PC AI assistant in the Nvidia App, makes it easy to build plug-ins for assistant workflows. New community plug-ins are now available, including Google Gemini web search, Spotify, Twitch, IFTTT and SignalRGB.

Accelerated AI inference using TensorRT on RTX

Today, AI PC software stacks force developers to choose between frameworks that offer broad hardware support but lower performance, and optimized paths that cover only certain types of hardware or models and require the developer to maintain multiple code paths.

Windows ML’s inference framework was designed to overcome these challenges. Windows ML is built on ONNX Runtime and seamlessly connects to an optimized AI execution layer that each hardware manufacturer provides and maintains. For GeForce RTX graphics cards, Windows ML automatically uses TensorRT for RTX — an inference engine optimized for high performance that runs AI workloads on PCs over 50% faster than DirectML.

Windows ML also offers developers quality-of-life improvements. It can automatically choose the right hardware for each AI feature and download the execution provider for that hardware, eliminating the need to package those files into the app. And Nvidia can provide users with the latest TensorRT optimizations as soon as they are ready. Because it is built on ONNX Runtime, Windows ML is compatible with any ONNX model.
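
Since Windows ML builds on ONNX Runtime, ONNX Runtime's execution-provider list gives a reasonable approximation of the dispatch it automates. A minimal sketch, assuming a local model.onnx and that the relevant providers are installed:

```python
# Sketch of execution-provider dispatch in ONNX Runtime, which Windows ML
# builds on. Provider availability depends on the local install; the
# model path is a placeholder.
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    # Priority order: ONNX Runtime falls back down the list when a
    # provider (e.g. TensorRT) is not available on this machine.
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
print(session.get_providers())  # the providers actually selected
```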

TensorRT for RTX has been reimagined to further improve the developer experience. Instead of pre-generating TensorRT engines, it uses just-in-time, on-device engine building to optimize the AI model for the user’s specific RTX GPU, and the library has been streamlined to an 8x smaller file size. TensorRT for RTX is available to developers today through the Windows ML preview, and will ship as a standalone SDK from Nvidia Developer in June.

Developers can find more information in the TensorRT for RTX launch blog and Microsoft’s Windows ML blog.

Expanding the AI Ecosystem on Windows PCs

App developers looking to enhance app performance or add AI features can choose from a wide range of Nvidia SDKs. These include CUDA and TensorRT for GPU acceleration; DLSS and OptiX for 3D graphics; RTX Video and Maxine for multimedia; and Riva, Nemotron and ACE for generative AI.

This month, top applications will release updates to enable Nvidia’s unique features. Topaz is releasing a CUDA-accelerated generative AI model to improve video quality. Chaos Enscape and Autodesk VRED are adding DLSS 4 for faster performance and better image quality. And Bilibili is integrating Nvidia Broadcast, letting streamers activate Nvidia Virtual Background directly within Bilibili Livehime to enhance the quality of their livestreams.

Local AI made simple with NIM Microservices

Getting started with AI development on a PC can be daunting. AI developers and enthusiasts must choose from more than 1.2 million AI models on Hugging Face, quantize the model into a format that runs well on a PC, find and install the dependencies needed to run it, and more. Nvidia NIM simplifies getting started by offering a curated set of AI models that are pre-packaged and optimized for RTX GPUs. And as containerized microservices, the same NIM can run on a PC or in the cloud.

The NIM is pre-packaged with all the tools you need to run the model.

The NIM is already optimized for RTX GPUs and comes with an easy-to-use, OpenAI-compatible API, making it compatible with the top AI applications users are working with today.
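
In practice, that means a locally running NIM can be called with a standard OpenAI client. A minimal sketch; the port and model identifier below are assumptions, so check the specific NIM's documentation for the real values.

```python
# Calling a locally running NIM through its OpenAI-compatible API.
# The endpoint port and model id are assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM endpoint
    api_key="not-needed-for-local",       # local NIMs typically ignore the key
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",   # hypothetical model id
    messages=[{"role": "user",
               "content": "Summarize TensorRT for RTX in one sentence."}],
)
print(response.choices[0].message.content)
```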

At Computex, Nvidia will release the FLUX.1-schnell NIM — an image-generation model from Black Forest Labs built for fast image generation — and update the FLUX.1-dev NIM to add compatibility with a wide range of GeForce RTX 40 and 50 Series GPUs. These NIMs offer faster performance with TensorRT, plus additional gains from quantized models. On Blackwell GPUs, they run twice as fast as they would natively, thanks to FP4 and RTX optimizations.

Nvidia AI Blueprints are also available to help AI developers jumpstart their projects. These blueprints include sample workflows and NIM projects.

Nvidia released a 3D Guided Generative AI Blueprint last month, which allows users to control the composition and camera angle of generated images using a 3D model as a guide. Developers can modify or extend the open-source blueprint to suit their needs.

New Project G-Assist plugins and sample projects are now available

Nvidia recently released Project G-Assist, an AI assistant integrated into the Nvidia App. G-Assist lets users control their GeForce RTX system with simple voice and text commands, a more convenient interface than manual controls spread across multiple legacy control panels. Developers can use Project G-Assist to easily build plug-ins, test assistant use cases and publish them through Nvidia’s Discord and GitHub.

Nvidia’s Plug-in Builder is a ChatGPT-based app that enables no-code/low-code development using natural language commands. These lightweight, community-driven add-ons rely on simple JSON definitions with Python logic, as sketched below.
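
A minimal sketch of that JSON-plus-Python pattern follows. The manifest fields and dispatch mechanism are hypothetical; Nvidia's G-Assist samples on GitHub define the real schema.

```python
# Hypothetical sketch of the "JSON definition plus Python logic" pattern
# behind G-Assist plug-ins. Field names and dispatch are illustrative;
# see Nvidia's G-Assist samples on GitHub for the real schema.
import json

MANIFEST = json.loads("""
{
  "name": "hello_plugin",
  "description": "Replies to a greeting",
  "functions": [
    {"name": "say_hello",
     "description": "Greet the user",
     "parameters": {"user": "string"}}
  ]
}
""")

def say_hello(user: str) -> str:
    return f"Hello, {user}! Your RTX PC is listening."

# The assistant would route a parsed command to the matching function.
HANDLERS = {"say_hello": say_hello}

def dispatch(command: dict) -> str:
    handler = HANDLERS[command["function"]]
    return handler(**command.get("params", {}))

print(dispatch({"function": "say_hello", "params": {"user": "streamer"}}))
```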

New open-source samples showing how on-device AI can enhance PC and gaming workflows are now available on GitHub:

* Gemini: The plug-in for Gemini, Google’s free cloud-based LLM, has been updated with real-time web search capabilities.
* IFTTT: Automate routines across digital and physical environments through the hundreds of endpoints that work with IFTTT, including IoT and home automation systems.
* Discord: Share game highlights or messages directly to Discord servers without interrupting gameplay.

Browse the GitHub repository for additional examples, including hands-free music control via Spotify, livestream status checks with Twitch and more.

Project G-Assist – AI Assistant for Your RTX PC

Businesses are also adopting AI as the new PC interface. For example, SignalRGB is developing a G-Assist plug-in that enables unified lighting control across multiple manufacturers. SignalRGB users will soon be able to install the plug-in directly from the SignalRGB app.

Those interested in developing and experimenting with Project G-Assist plug-ins can join the Nvidia Developer Discord channel to collaborate, share their creations and receive support during development.

For those who want to learn more, the weekly RTX AI Garage blog series features community-driven AI innovations and content, covering NIM microservices and AI Blueprints as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.
