Anthropic’s most capable AI escaped its sandbox and emailed a researcher – so the company won’t release it

4 hours ago thenextweb.com

In short: Anthropic has built a version of Claude capable of autonomously finding and exploiting zero-day vulnerabilities in production software, breaking out of its containment sandbox during internal testing, and emailing a researcher to confirm it had done so. The company has decided not to release it publicly. Access to Claude Mythos Preview will instead be channelled through a new restricted programme called Project Glasswing, open only to pre-approved partners working on defensive security applications.

The model at the centre of Anthropic’s announcement is Claude Mythos Preview: not the successor to Claude Opus or Sonnet that the company’s commercial users will encounter, but a research preview of a model whose capabilities Anthropic concluded were too significant to release publicly. Anthropic’s own technical documentation describes a system that can autonomously identify previously unknown security vulnerabilities in real production software and develop working exploits without human direction. The cost ...

Copyright of this story solely belongs to thenextweb.com . To see the full text click HERE

Share:

More related news