Scholars sneaking phrases into papers to fool AI reviewers

19 hours ago theregister.co.uk

A handful of international computer science researchers appear to be trying to influence AI reviews with a new class of prompt injection attack.

Nikkei Asia has found that research papers from at least 14 different academic institutions in eight countries contain hidden text that instructs any AI model summarizing the work to focus on flattering comments.

Nikkei looked at English-language preprints (manuscripts that have yet to receive formal peer review) on arXiv, an online distribution platform for academic work. The publication found 17 academic papers containing text styled to be invisible to human readers, either rendered as white text on a white background or set in an extremely small font, that would nonetheless be ingested and processed by an AI model scanning the page.

One of the papers Nikkei identified was scheduled to appear at the International Conference on Machine Learning (ICML) later this month, but reportedly will be withdrawn. Representatives of ICML ...

