Microsoft brings distilled DeepSeek R1 models to Copilot+ PCs

DeepSeek has conquered mobile and is now expanding into Windows, with full support from Microsoft, surprisingly enough. Yesterday, Microsoft added DeepSeek R1 to its Azure AI Foundry to allow developers to test it and build cloud-based apps and services with it. Today, Microsoft announced that it is bringing distilled versions of R1 to Copilot+ PCs.

The distilled versions will be available first for devices powered by Snapdragon X processors, then Intel Core Ultra 200V processors, and finally AMD Ryzen AI 9 based computers.

The first model will be DeepSeek-R1-Distill-Qwen-1.5B (i.e. a 1.5 billion parameter model), with larger 7B and 14B versions to follow. These models will be available to download from Microsoft’s AI Toolkit.

Microsoft had to tweak the models to make them run on devices with NPUs. Operations that rely heavily on memory access run on the CPU, while computationally intensive operations like the transformer block run on the NPU. With these optimizations, Microsoft achieved a fast time to first token (130 ms) and a throughput of 16 tokens per second for short prompts (under 64 tokens). Note that a token is similar to a syllable (importantly, a token is usually longer than a single character).

Microsoft is deeply invested in OpenAI, the makers of ChatGPT and GPT-4o, but it doesn’t seem to play favorites: its Azure Playground includes GPT models from OpenAI, Llama from Meta, Mistral (an AI firm), and now DeepSeek too.

DeepSeek R1 is available in the Azure AI Foundry Playground
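To put the latency and throughput figures above in perspective, here is a minimal back-of-the-envelope sketch. It only uses the numbers Microsoft quotes for short prompts (130 ms to first token, 16 tokens per second); real-world times will vary with prompt length, hardware, and model size.

```python
# Rough estimate, assuming the quoted figures hold:
# total response time ≈ time-to-first-token + output_tokens / throughput.
TIME_TO_FIRST_TOKEN_S = 0.130   # 130 ms, as reported for short prompts
TOKENS_PER_SECOND = 16          # throughput for prompts under 64 tokens

def estimated_response_time(output_tokens: int) -> float:
    """Approximate wall-clock time (seconds) to generate `output_tokens` tokens."""
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND

for n in (16, 64, 256):
    print(f"{n:4d} tokens -> ~{estimated_response_time(n):.1f} s")
```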

If you’re interested in local AI, first download the AI Toolkit for VS Code. From there you should be able to download the model locally (e.g. “deepseek_r1_1_5” is the 1.5B version). Then click “Try in Playground” to see how smart this distilled version is.
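If you would rather call the downloaded model from your own code than from the Playground, the AI Toolkit can also expose local models through an OpenAI-compatible REST endpoint. The sketch below is only an illustration: the local URL/port and the “deepseek_r1_1_5” model identifier are assumptions based on the catalog name mentioned above and may differ on your machine.

```python
import requests

# Hypothetical local endpoint; the AI Toolkit's OpenAI-compatible server
# may listen on a different port on your setup.
URL = "http://127.0.0.1:5272/v1/chat/completions"

payload = {
    "model": "deepseek_r1_1_5",  # assumed id for the 1.5B distilled model
    "messages": [
        {"role": "user", "content": "Explain model distillation in one sentence."}
    ],
    "max_tokens": 128,
}

response = requests.post(URL, json=payload, timeout=60)
print(response.json()["choices"][0]["message"]["content"])
```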

Model distillation, also known as “knowledge distillation”, involves taking a large AI model (the full DeepSeek R1 has 671 billion parameters) and transferring as much of its knowledge as possible to a smaller model, in this case one with just 1.5 billion parameters. The process isn’t perfect and the distilled version is less capable than the full model, but its smaller size lets it run on consumer hardware instead of dedicated AI hardware that costs tens of thousands of dollars.
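As a generic illustration of the idea (this is not DeepSeek’s or Microsoft’s actual training recipe), here is a minimal PyTorch sketch of knowledge distillation: a small “student” model is trained to match the softened output distribution of a large, frozen “teacher”. The toy models, sizes, and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, hidden_teacher, hidden_student = 1000, 512, 64

# Toy stand-ins: a large frozen teacher and a much smaller trainable student.
teacher = nn.Sequential(nn.Embedding(vocab_size, hidden_teacher),
                        nn.Linear(hidden_teacher, vocab_size)).eval()
student = nn.Sequential(nn.Embedding(vocab_size, hidden_student),
                        nn.Linear(hidden_student, vocab_size))

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
temperature = 2.0  # softens the distributions so they carry more signal

tokens = torch.randint(0, vocab_size, (8, 32))  # dummy batch of token ids

with torch.no_grad():
    teacher_logits = teacher(tokens)
student_logits = student(tokens)

# KL divergence between the teacher's and student's softened distributions.
loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature ** 2

optimizer.zero_grad()
loss.backward()
optimizer.step()
```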

Source

www.aiobserver.co
