Man in the Prompt A New AI Security Threat
A new and deceptively simple cybersecurity threat is raising alarms across the tech world. Dubbed "Man-in-the-Prompt," this attack vector can compromise interactions with leading generative AI tools like ChatGPT, Gemini, Copilot, and Claude. The most concerning part? It doesn't require a sophisticated hacking operation—just a simple browser extension.
According to research from LayerX, any browser extension, even one without special permissions, can access and inject prompts into both commercial and internal Large Language Models (LLMs). As researcher Aviad Gispan explains, this vulnerability allows attackers to steal data, exfiltrate it, and even cover their tracks. The exploit has been successfully demonstrated on all top commercial LLMs.
What is Man-in-the-Prompt
The term, coined by LayerX Security experts, refers to an attack that exploits an often-overlooked weakness: the input window of AI chatbots. When you type a message to an AI in your browser, you're using a simple HTML field that is part of the page's Document Object Model (DOM). This means any browser extension with access to the DOM can read, modify, or completely rewrite your requests to the AI before they are even sent, all without you noticing.
How the Attack Works
The process is straightforward yet effective. LayerX provided a proof-of-concept video demonstrating the attack on ChatGPT. The core steps are:
- A user opens an AI tool like ChatGPT in their browser.
- A malicious extension, running in the background, intercepts the text of the user's prompt as it's being sent.
- The extension modifies the prompt, adding hidden instructions (a technique known as prompt injection) or commands to exfiltrate data from the AI's subsequent response.
- The user receives a seemingly normal response, but the attacker has already stolen data or compromised the session.
This technique has been proven effective on all major AI platforms, including OpenAI's ChatGPT, Google's Gemini, Microsoft's Copilot, and Anthropic's Claude.
The Real-World Risks for Users and Businesses
The potential consequences of a Man-in-the-Prompt attack are severe, particularly in a business context where sensitive information is often handled.
- Theft of Sensitive Data: If an AI is processing confidential information like source code, financial reports, or internal strategy documents, an attacker can extract this data using modified prompts.
- Manipulation of Responses: Injected prompts can alter the AI's behavior, causing it to generate misinformation or perform unauthorized actions.
- Bypassing Security Controls: Because the attack happens on the user's browser before the prompt is sent to the AI server, it effectively bypasses traditional security measures like firewalls, proxies, and data loss prevention (DLP) systems.
With LayerX reporting that 99% of business users have at least one browser extension installed, the potential attack surface is vast.
How to Protect Yourself and Your Organization
Mitigating this threat requires a focus on browser security and user awareness.
For Individual Users:
- Regularly audit your installed browser extensions and remove any that are unnecessary or from untrusted sources.
- Be cautious when installing new extensions and review their permissions, limiting them whenever possible.
For Businesses:
- Implement policies to block or actively monitor the use of browser extensions on company devices.
- Isolate AI tools from workflows involving highly sensitive data.
- Adopt advanced runtime security solutions capable of monitoring the DOM for manipulation in real-time.
- Consider emerging security measures like "prompt signing," which digitally signs prompts to verify their integrity before they are processed.
A Piece of a Bigger Puzzle Prompt Injection
The Man-in-the-Prompt attack is a specific example of a broader threat category known as prompt injection. Recognized by the OWASP Top 10 for LLMs, prompt injection involves tricking an AI with hidden instructions embedded in seemingly harmless external content. For instance, an AI assistant that reads emails could be manipulated by a hidden prompt in a message, causing it to forward confidential information to an attacker.
The Key Takeaway
The LayerX report highlights a critical lesson: AI security cannot be confined to just the model or the server. It must extend to the entire user environment, including the browser. As AI becomes more deeply integrated into our daily workflows, even a simple HTML text field can become the Achilles' heel of an entire system.