OpenAI has acknowledged that prompt injection attacks remain a long-term security challenge for AI-powered browsers, even as the company strengthens defenses around its ChatGPT Atlas browser.

In a blog post published on Monday, OpenAI said prompt injection — a technique in which attackers embed malicious instructions into web pages, documents, or emails to manipulate AI agents — is unlikely to be fully eliminated.

“Prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully ‘solved,’” the company said, adding that Atlas’ “agent mode” expands the overall security threat surface.

OpenAI launched the Atlas browser in October, and security researchers quickly demonstrated vulnerabilities that allowed hidden instructions in documents, such as Google Docs, to alter the browser’s behaviour. Cybersecurity firm Brave has also warned that indirect prompt injection poses a systemic risk to AI-driven browsers, including rival products.

The warning echoes recent guidance from the UK’s National Cyber Security Centre, which said earlier this month that prompt injection attacks against generative AI systems may never be fully mitigated and advised organisations to focus on reducing their impact rather than trying to eliminate them entirely.

To address the threat, OpenAI said it is adopting a rapid-response security approach and has developed an automated attacker powered by reinforcement learning to simulate real-world hacking attempts. The system is designed to identify vulnerabilities internally before they are exploited externally.

According to OpenAI, the automated attacker has uncovered attack strategies that were not detected during human-led testing or reported by external researchers. In one demonstration, the attacker embedded malicious instructions in an email that caused an AI agent to send a resignation message instead of drafting an out-of-office reply. OpenAI said updated safeguards were able to detect and block the attack.

An OpenAI spokesperson declined to say whether the new measures have measurably reduced successful prompt injection attempts but said the company has worked with third parties on security improvements since before Atlas’ launch.

Rami McCarthy, a principal security researcher at cybersecurity firm Wiz, said reinforcement learning can help adapt to evolving threats but cautioned that agentic browsers carry inherent risks.

“Agentic browsers tend to combine moderate autonomy with very high access,” McCarthy said, noting that such access to email and payment systems increases exposure even as it enables powerful functionality.

OpenAI recommends that users limit agent access, require confirmation for sensitive actions, and provide narrowly defined instructions to reduce the risk of malicious influence.