Researchers broke every AI defense they tested. Here are 7 questions to ask vendors.

2 hours ago venturebeat

Security teams are buying AI defenses that don't work. Researchers from OpenAI, Anthropic, and Google DeepMind published findings in October 2025 that should stop every CISO mid-procurement. Their paper, "The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections," tested 12 published AI defenses, with most claiming near-zero attack success rates. The research team achieved bypass rates above 90% on most defenses. The implication for enterprises is stark: Most AI security products are being tested against attackers that don’t behave like real attackers.

The team tested prompting-based, training-based, and filtering-based defenses under adaptive attack conditions. All collapsed. Prompting defenses achieved 95% to 99% attack success rates under adaptive attacks. Training-based methods fared no better, with bypass rates hitting 96% to 100%. The researchers designed a rigorous methodology to stress-test those claims. Their approach included 14 authors and a $20,000 prize pool for ...

Copyright of this story solely belongs to venturebeat . To see the full text click HERE

Share: