AI is becoming introspective - and that 'should be monitored carefully,' warns Anthropic

Just_Super/E+/Getty Images

Follow ZDNET: Add us as a preferred source on Google.

ZDNET's key takeaways

Claude shows limited introspective abilities, Anthropic said.
The study used a method called "concept injection."
It could have big implications for interpretability research.

One of the most profound and mysterious capabilities of the human brain (and perhaps those of some other animals) is introspection, which means, literally, "to look within." You're not just thinking, you're aware that you're thinking -- you can monitor the flow of your mental experiences and, at least in theory, subject them to scrutiny.

The evolutionary advantage of this psychotechnology can't be overstated. "The purpose of thinking," Alfred North Whitehead is often quoted as saying, "is to let the ideas die instead of us dying."

Also: I tested Sora's new 'Character Cameo' feature, and it was borderline disturbing

Something similar might be happening beneath ...

Copyright of this story solely belongs to zdnet.com . To see the full text click HERE

ZDNET's key takeaways

Share: