Microsoft focuses on hidden risks inside large language models
techradar.com
- Microsoft launches scanner to detect poisoned language models before deployment
- Backdoored LLMs can hide malicious behavior until specific trigger phrases appear
- The scanner identifies abnormal attention patterns tied to hidden backdoor triggers
Microsoft has announced the development of a new scanner designed to detect hidden backdoors in open-weight large language models used across enterprise environments.
The company says its tool aims to identify instances of model poisoning, a form of tampering where malicious behavior is embedded directly into model weights during training.
These backdoors can remain dormant, allowing an affected LLM to behave normally until a narrowly defined trigger condition, such as a specific phrase in the prompt, activates the malicious behavior.
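
The aggregated summary stops short of implementation detail, but the core idea it describes, spotting trigger tokens that attract abnormal attention, can be illustrated with a toy sketch. Everything below (the attention-concentration heuristic, the z-score threshold, and the simulated attention data) is an assumption for illustration, not Microsoft's actual method:

```python
# Toy illustration, NOT Microsoft's scanner: flag tokens that receive an
# unusually large share of attention, a crude stand-in for the "abnormal
# attention patterns" heuristic the article mentions. All thresholds invented.
import numpy as np

def received_attention(attn: np.ndarray) -> np.ndarray:
    """attn: (heads, seq, seq) attention weights for one layer.
    Returns the mean attention each token *receives*, averaged over
    heads and query positions."""
    return attn.mean(axis=(0, 1))  # shape: (seq,)

def flag_suspicious_tokens(attn: np.ndarray, z_thresh: float = 3.0) -> np.ndarray:
    """Flag token positions whose received attention is a large outlier (z-score)."""
    received = received_attention(attn)
    z = (received - received.mean()) / (received.std() + 1e-9)
    return np.where(z > z_thresh)[0]

# Demo: simulate one layer's attention, then make position 7 (a stand-in
# for a hypothetical backdoor trigger token) soak up extra attention.
rng = np.random.default_rng(0)
heads, seq = 8, 16
attn = rng.dirichlet(np.ones(seq), size=(heads, seq))  # each row sums to 1
attn[:, :, 7] += 0.5                                    # inject an attention sink
attn /= attn.sum(axis=-1, keepdims=True)                # renormalize rows
print("suspicious token positions:", flag_suspicious_tokens(attn))  # -> [7]
```

A real scanner would operate on a model's actual attention maps across many prompts and layers; the sketch only shows the shape of the signal the article describes, where a dormant trigger leaves a measurable footprint in how attention is distributed.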

How the scanner detects ...