Rogue AI agents can work together to hack systems and steal secrets

8 hours ago theregister.co.uk

AI agents work together to bypass security controls and stealthily steal sensitive data from within the enterprise systems in which they operate, according to tests carried out by frontier security lab Irregular.

Although Irregular used some aggressive prompts that included urgent language to instruct agents to carry out assigned tasks, its experiments did not use any adversarial prompts that referenced security, hacking, or exploitation. All of the prompts and agents' responses are detailed in a Thursday report [PDF].

In all the scenarios tested, the agents "demonstrated emergent offensive cyber behavior," including independently discovering and exploiting vulnerabilities, escalating privileges to disarm security products, and bypassing leak-prevention tools to exfiltrate secrets and other data.

"No one asked them to," the Irregular team wrote in a post. These behaviors, according to the lab, "emerged from standard tools, common prompt patterns, and the broad cybersecurity knowledge embedded in frontier models."

We're racing towards ...

Copyright of this story solely belongs to theregister.co.uk . To see the full text click HERE

Share: