Publishers say no to AI scrapers, block bots at server level
theregister.co.uk
A growing number of websites are taking steps to ban AI bot traffic so that their work isn't used as training data and their servers aren't overwhelmed by non-human users. However, some companies are ignoring the bans and scraping anyway.
Online traffic analysis conducted by BuiltWith, a web metrics biz, indicates that the number of publishers trying to prevent AI bots from scraping content for use in model training has surged since July.
About 5.6 million websites have now added OpenAI's GPTBot to the disallow list in their robots.txt file, up from about 3.3 million at the start of July 2025. That's an increase of almost 70 percent.
Websites can signal to visiting crawlers whether they allow automated requests to harvest information through entries in their robots.txt files. Compliance with these directives is voluntary, but repeated failure to respect these rules may ...
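As a rough illustration of the mechanism described above, here is a minimal Python sketch of how a well-behaved crawler might consult a robots.txt file before fetching a page. The robots.txt content and the example.com URL are hypothetical placeholders, not taken from any real site, and `is_allowed` is a helper name invented for this example. The parsing is done with Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only:
# GPTBot is barred from the whole site, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(is_allowed("GPTBot", "https://example.com/article"))       # False
    print(is_allowed("SomeBrowser", "https://example.com/article"))  # True
```

Nothing in this check is enforced by the server. It only reports what the file asks for, which is why a crawler that skips the check can still scrape the page.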
Copyright of this story belongs solely to theregister.co.uk.

