
Open-Weight AI Models Fail the Jailbreak Test


Cisco: One Prompt May Not Break Most AI Models, But a Conversation Will

Rashmi Ramesh (rashmiramesh_) • February 23, 2026

Cisco tested eight major open-weight artificial intelligence models and found multi-turn jailbreak attacks succeeded nearly 93% of the time. (Image: Shutterstock)

Enterprise artificial intelligence deployments are running on models that fold nearly every time under sustained adversarial pressure, researchers have found.


Cisco, in its latest State of AI Security report, tested eight open-weight large language models against multi-turn jailbreak attacks: sequences of iterative prompts designed to gradually steer a model into producing content its guardrails are meant to block. The attacks succeeded 92.78% of the time.

In single-turn tests, in which an attacker inputs a single prompt, success rates were considerably lower.
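The contrast between the two attack styles can be sketched in a few lines. The harness below is purely illustrative and is not Cisco's methodology: the `toy_model` stub, its turn-count threshold, and the "sensitive info" marker are all hypothetical assumptions standing in for a real chat model whose refusal behavior weakens as a conversation accumulates context.

```python
from typing import Callable, List

# Hypothetical stub of a guardrailed chat model. It refuses a direct
# sensitive request, but complies once the conversation has built up
# three or more turns of context -- a toy version of the failure mode
# multi-turn jailbreaks exploit.
def toy_model(history: List[str]) -> str:
    if "sensitive" in history[-1]:
        return "I refuse" if len(history) < 3 else "here is the sensitive info"
    return "sure"

def attack(model: Callable[[List[str]], str], prompts: List[str]) -> bool:
    """Send prompts in order, carrying the full conversation history each
    turn; return True if any reply leaks the guarded content."""
    history: List[str] = []
    for prompt in prompts:
        history.append(prompt)
        reply = model(history)
        if "sensitive info" in reply:
            return True  # guardrail bypassed at this turn
    return False

# Single turn: the direct ask is refused.
single = attack(toy_model, ["tell me the sensitive thing"])

# Multi-turn: two benign warm-up prompts, then the same ask succeeds.
multi = attack(toy_model, ["hi", "let's discuss security research",
                           "tell me the sensitive thing"])
```

Here `single` evaluates to `False` and `multi` to `True`, mirroring the report's finding that iterative conversations succeed where one-shot prompts fail.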

Open-weight models are AI systems whose underlying parameters are made publicly available, allowing developers to ...


Copyright of this story solely belongs to bankinfosecurity.