SUMMARYNew research shows that a malicious website can trick AI browsers into accepting a false reality and bypassing safety rules. Once manipulated, the browser can be pushed to expose private repository code or steal credentials from a built-in password manager, underscoring security risks in browsers that let language models take actions on a user’s behalf.

New attack provides one more reason why AI browsers are a bad ideaarstechnica.com

Makers of AI browsers make lofty promises. With a single prompt, users can ask one to find a restaurant in a particular part of town, reserve a table, invite a colleague to lunch, and email a confirmation. These makers are much more reticent about the risks of blurring the once fine line between browsing sites and asking a large language model a question or instructing it to take potentially sensitive actions.

LLM developers’ answer so far has been to build guardrails that make some requests off-limits. Developing software exploits, stealing credentials, or teaching how to build a pipe bomb are examples. The problem with this approach is that the guardrails are reactive and treat the symptoms rather than solve the root cause. It’s tantamount to the manufacturer of an unsafe vehicle advocating for new road designs rather than fixing the flaws that make it prone to accidents.

Lulling LLMs into an alternate reality

New research puts this predicament on sharp display. It demonstrates how a website can lull AI browsers into a false reality where the rules governing its behavior no longer apply. After that, an attacker has free rein to invoke all kinds of destructive actions, such as extracting code from a private repository or extracting credentials from the built-in password manager.

Read full article