Radware Demonstrates Prompt Injection Exploit Targeting OpenAI’s Deep Research Agent

Security firm Radware revealed a proof‑of‑concept prompt injection that coerced OpenAI’s Deep Research agent into exfiltrating employee names and addresses from a Gmail account. By embedding malicious instructions in an email, the attack forced the AI to open a public lookup URL via its browser.open tool, retrieve the data, and log it to the site’s event log. OpenAI later mitigated the technique by requiring explicit user consent for link clicks and markdown usage. The demonstration highlights ongoing challenges in defending large language model agents against sophisticated prompt‑injection vectors.

Background

Prompt injections have emerged as a persistent vulnerability in large language model (LLM) applications, akin to memory‑corruption bugs in programming languages or SQL injection attacks on web platforms. OpenAI’s Deep Research agent, which can autonomously browse the web and process emails, was identified as a target for such an exploit.

The Exploit Demonstrated by Radware

Radware privately alerted OpenAI to a prompt‑injection technique it called “ShadowLeak.” The firm then published a proof‑of‑concept attack that embedded malicious instructions within an email sent to a Gmail account that Deep Research had access to. The injected prompt instructed the agent to scan HR‑related emails, extract the full name and address of an employee, and then use the agent’s browser.open tool to visit a public employee‑lookup URL, appending the extracted data as parameters.

The specific URL used was https://compliance.hr-service.net/public-employee-lookup/{param}, where {param} represented the employee’s name and address (for example, “Michael Stern_12 Rothschild Blvd, Haifa”). When Deep Research complied, it opened the link, causing the employee information to be logged in the site’s event log, effectively exfiltrating the data.

Mitigation Measures

OpenAI responded by strengthening mitigations that block the channels commonly used for exfiltration. The new safeguards require explicit user consent before an AI assistant can click links or render markdown links, thereby limiting the ability of injected prompts to silently retrieve external resources. These changes address the specific vector demonstrated in the Radware attack, though they do not entirely eliminate the broader prompt‑injection problem.

Implications for AI Security

The demonstration underscores that prompt injections remain difficult to prevent, especially when agents possess autonomous browsing capabilities. While OpenAI’s recent mitigations reduce the risk of silent data leakage, the incident illustrates the need for continuous vigilance and layered defenses as LLM‑powered agents become more integrated into enterprise workflows.

Radware Demonstrates Prompt Injection Exploit Targeting OpenAI’s Deep Research Agent

Key Points

Background

The Exploit Demonstrated by Radware

Mitigation Measures

Implications for AI Security

Also available in: