
Anthropic vs. OpenAI red teaming methods reveal different security priorities for enterprise AI


Model providers want to prove the security and robustness of their models, publishing system cards and conducting red-team exercises with each new release. But it can be difficult for enterprises to parse the results, which vary widely and can be misleading.

Anthropic's 153-page system card for Claude Opus 4.5 and OpenAI's 60-page GPT-5 system card reveal a fundamental split in how the two labs approach security validation. Anthropic's card emphasizes multi-attempt attack success rates drawn from 200-attempt reinforcement learning (RL) campaigns, while OpenAI reports resistance to attempted jailbreaks. Both metrics are valid. Neither tells the whole story.
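To see why single-attempt and multi-attempt numbers can diverge so sharply, consider a toy calculation: a per-attempt success rate that looks negligible compounds quickly when an attacker is allowed hundreds of tries, as in a 200-attempt campaign. The sketch below is illustrative only; the 1% per-attempt probability and the independence assumption are hypothetical, not figures reported in either system card.

```python
# Illustrative sketch only: all numbers are hypothetical placeholders,
# not figures from the Anthropic or OpenAI system cards.

def multi_attempt_asr(per_attempt_asr: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - per_attempt_asr) ** attempts

if __name__ == "__main__":
    p = 0.01  # hypothetical 1% chance that a single adversarial attempt lands
    for k in (1, 10, 200):
        print(f"Attack success rate over {k:>3} attempts: {multi_attempt_asr(p, k):.1%}")
    # A model that looks robust per attempt (1%) fails most of the time
    # once an attacker runs a 200-attempt campaign (~86.6%).
```

The point of the sketch is simply that the two reporting styles answer different questions: one describes how often a single probe gets through, the other how a model holds up against a sustained campaign.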

Security leaders deploying AI agents for browsing, code execution and autonomous action need to know what each red-team evaluation actually measures, and where the blind spots are.

What the attack data shows

Gray Swan's Shade platform ran adaptive adversarial campaigns against Claude models ...

