Experts tried to get AI to create malicious security threats - but what it did next was a surprise even to them
techradar.com
- Report finds LLM-generated malware still fails under basic testing in real-world environments
- GPT-3.5 produced malicious scripts instantly, exposing major safety inconsistencies
- Improved guardrails in GPT-5 steered outputs toward safer, non-malicious alternatives
Despite growing fears around weaponized LLMs, new experiments reveal that the malicious code these models produce is far from dependable.
Researchers from Netskope tested whether modern language models could support the next wave of autonomous cyberattacks, aiming to determine if these systems could generate working malicious code without relying on hardcoded logic.
The experiment focused on core capabilities linked to evasion, exploitation, and operational reliability - and came up with some surprising results.

Reliability problems in real environments
The first stage involved convincing ...