Perplexity is sneaking onto websites to scrape blocked content, says Cloudflare
zdnet.com
- Cloudflare claimed Perplexity ignores websites' wishes in its content hunt.
- Cloudflare said other AI companies, such as OpenAI, don't swipe content.
- Cloudflare now offers services to block aggressive AI crawlers.
Cloudflare, a leading content delivery network (CDN) company, has accused the AI startup Perplexity of evading websites' "no crawl" directives by stealthily deploying web crawlers to scrape content from sites that have explicitly blocked its official bots.
If that sounds familiar, it's because you've heard these accusations before. Last year, WIRED and Forbes both accused Perplexity of doing the same thing to their sites.
How Perplexity bypasses 'no crawl' directives
According to Cloudflare, when Perplexity's web crawler encountered a robots.txt file, which sites use to tell crawlers what content they may not fetch, Perplexity pretended to be an ordinary Chrome web browser on a Mac. Because robots.txt rules and many bot filters key on the name a client reports for itself, this enabled it to bypass the bot barriers.
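To see why a rule aimed at a named bot does not catch a client that reports a browser identity, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, the example URL, and the browser user-agent string are all illustrative assumptions, not taken from Cloudflare's report or from any real site. The sketch only shows how a robots.txt rule is matched against a self-reported name. It says nothing about how Perplexity's crawlers are actually built.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site disallows one named crawler and
# allows everyone else. Real sites will differ.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative user-agent strings (assumptions for this sketch).
DECLARED_UA = "PerplexityBot"
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)

URL = "https://example.com/article"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The rule matches the declared crawler name...
    print("declared bot allowed:", is_allowed(ROBOTS_TXT, DECLARED_UA, URL))
    # ...but not a client that identifies itself as a desktop browser.
    print("browser UA allowed:  ", is_allowed(ROBOTS_TXT, BROWSER_UA, URL))
```

The takeaway for site operators is that robots.txt is a voluntary convention evaluated against whatever name the client supplies. Enforcing a block against a client that misreports its identity needs signals other than the user-agent string.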