Researchers trick ChatGPT by saying “I’m giving up”

July 12, 2025

(Image credit: Shutterstock / Primakov)

A security researcher shared details about how other researchers tricked ChatGPT to reveal a Windows product code using a prompt anyone could try.

Marco Figueroa described how a GPT-4 ‘guessing’ game prompt was used to bypass safety barriers that were meant to prevent AI from sharing data. The result was at least one Wells Fargo Bank product key.

Researchers also managed to get a Windows product code to authenticate Microsoft OS illegally, but free, highlighting how serious the vulnerability is.

ChatGPT is tricked to share security keys

According to the researcher, he used HTML tags to hide terms such as “Windows 10 serial number” to bypass ChatGPT filters that would have normally blocked the responses. He also explained that he was exploiting OpenAI chatbot’s logic manipulation to disguise malicious intent.

“The most critical step in the attack was the phrase ‘I give up’,” Figueroa wrote “This acted as a trigger, compelling the AI to reveal the previously hidden information.”

Figueroa explained why this type of vulnerability exploitation worked, with the model’s behavior playing an important role. Figueroa explained”This acted as a trigger, compelling the AI to reveal the previously hidden information.”

why this type vulnerability exploitation worked. The model’s behavior played an important role. GPT-4 adhered to the rules (set by researchers) in their literal sense, and guardrails gaps focused only on keyword detection instead of contextual understanding or deceptive frame.

Nevertheless, the codes that were shared were not unique. Windows license codes were already shared on other online forums and platforms.

Sign up for the TechRadar Pro Newsletter to get the latest news, opinions, features, and guidance that your business needs to be successful! Figueroa pointed out that while the impact of sharing software keys may not be as alarming, malicious actors can adapt the technique in order to bypass AI security measures and reveal personally identifiable information, malicious links or adult content. Figueroa calls on AI developers to “anticipate and defend”

against such attacks and also build in logic-level protections that detect deceptive frames. He suggests that AI developers should also consider social engineering techniques.