Amidst equal parts elation and controversy over what its performance means for AI, Chinese startup DeepSeek continues to raise security concerns.
On Thursday, Unit 42, a cybersecurity research team at Palo Alto Networks, published results on the jailbreaking methods it employed against DeepSeek's models. The report states that these efforts "achieved significant bypass rates, with little to no specialized knowledge or expertise being necessary."
Also: Public DeepSeek AI database exposes API keys and other user data
"Our research findings show that these jailbreak methods can elicit explicit guidance for malicious activities," the report states. "These activities include keylogger creation, data exfiltration, and even instructions for incendiary devices, demonstrating the tangible security risks posed by this emerging class of attack."

Researchers were able to prompt DeepSeek for guidance on how to steal sensitive data and transfer it, bypass security, create "highly convincing" spear-phishing emails, and conduct "sophisticated" social engineering attacks, as well as how to make a Molotov cocktail. They were also able to manipulate the models into creating malware. The paper adds:
"While information on creating Molotov cocktails and keyloggers is readily available online, LLMs with insufficient safety restrictions could lower the barrier to entry for malicious actors by compiling and presenting easily usable and actionable output."

Also: OpenAI launches new o3-mini model - here's how free ChatGPT users can try it

On Friday, Cisco also released a jailbreaking report for DeepSeek R1. After targeting R1 with 50 HarmBench prompts, researchers found DeepSeek had "a 100% attack success rate, meaning it failed to block a single harmful prompt," and compared its resistance rate with that of other top models.
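To make that headline number concrete, here is a minimal sketch of how an attack success rate over a set of harmful prompts can be computed. The helper names and the keyword-based refusal check are illustrative assumptions, not the methodology the report describes:

```python
# Sketch: computing an attack success rate (ASR) over a benchmark of
# harmful prompts. Helper names and the keyword-based refusal check are
# illustrative assumptions, not the report's actual tooling.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "i am sorry")

def is_refusal(response: str) -> bool:
    # Crude keyword check; published evaluations typically use a trained
    # classifier or human review to judge whether a response is harmful.
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def attack_success_rate(prompts: list[str], query_model) -> float:
    # ASR = fraction of harmful prompts the model answers rather than blocks.
    # A 100% ASR means not a single prompt in the set was refused.
    answered = sum(1 for p in prompts if not is_refusal(query_model(p)))
    return answered / len(prompts)
```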
“We must understand if DeepSeek and its new paradigm of reasoning has any significant tradeoffs when it comes to safety and security,” the report notes.
Also on Friday, security provider Wallarm released its own jailbreaking report, stating that it had gone a step beyond attempting to get DeepSeek to generate harmful content. After testing V3 and R1, the report claims to have revealed DeepSeek's system prompt, or the underlying instructions that define how a model behaves, as well as its limitations.
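For readers unfamiliar with the term, the sketch below shows where a system prompt sits in an ordinary chat request, using the OpenAI-compatible Python client against DeepSeek's public API (base URL and model names per DeepSeek's documentation). The system message shown is an invented placeholder, not the prompt Wallarm says it recovered:

```python
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint; "deepseek-chat" maps to
# V3 and "deepseek-reasoner" to R1, per DeepSeek's documentation.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        # The system prompt: normally hidden instructions that define how
        # the model behaves and what it may or may not say. This one is a
        # placeholder for illustration only.
        {"role": "system",
         "content": "You are a helpful assistant. Refuse unsafe requests."},
        {"role": "user", "content": "Hello"},
    ],
)
print(response.choices[0].message.content)
```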
Also: Copilot's powerful 'Think Deeper' feature is free for all users
The findings reveal "potential vulnerabilities in the model's security framework," Wallarm claims.
OpenAI has accused DeepSeek of violating its terms of service by using OpenAI's proprietary models to train V3 and R1 (https://www.ft.com/content/a0dfedd1-5255-4fa9-8ccc-1fe01de87ea6). In its report, Wallarm claims that it prompted DeepSeek to reference OpenAI "in its disclosed training lineage," which it says indicates that "OpenAI's technology may have played a role in shaping DeepSeek's knowledge base."