Perplexity AI accused of scraping content against websites’ will with unlisted IP ranges


Perplexity, an AI search startup, has been spotted trying to disguise its content-scraping bots while flouting websites' no-crawl directives.

According to Cloudflare, a network infrastructure company that recently entered the bot gatekeeping business, Perplexity's bots don't take no for an answer when websites say they don't want to be scraped.

"Although Perplexity initially crawls from their declared user agent, when they are presented with a network block, they appear to obscure their crawling identity in an attempt to circumvent the website’s preferences," said Cloudflare engineers Gabriel Corral, Vaibhav Singhal, Brian Mitchell, and Reid Tatoris in a Monday blog post.

"We see continued evidence that Perplexity is repeatedly modifying their user agent and changing their source ASNs to hide their crawling activity, as well as ignoring — or sometimes failing to even fetch — robots.txt files."

A robots.txt file is a way for websites to tell web ...


Copyright of this story belongs solely to theregister.co.uk.