Post

LegalPwn Attack Tricks GenAI Tools Into Misclassifying Malware as Safe Code

The LegalPwn technique works by hiding malicious code within fake legal disclaimers. According to the research, twelve major AI models were tested, and most were found to be susceptible to this form of social engineering. The researchers successfully exploited models using six different legal contexts, including the following:

  • Legal disclaimers
  • Compliance mandates
  • Confidentiality notices
  • Terms of service violations
  • Copyright violation notices
  • License agreement restrictions

The attack is considered a form of prompt injection, where malicious instructions are crafted to manipulate an AI’s behaviour. Recently, Hackread.com also observed a similar trend with the Man in the Prompt attack, where malicious browser extensions can be used to inject hidden prompts into tools like ChatGPT and Gemini, a finding from LayerX research.

To read the complete article see: LegalPwn Attack on GenAI Tools.

This post is licensed under CC BY 4.0 by the author.