Comment on AI crawlers destroying websites in hunger for content
fuckwit_mcbumcrumble@lemmy.dbzer0.com 1 week ago
Moreover, AI crawlers are much more aggressive than standard crawlers. As the web hosting company InMotion Hosting notes, they also tend to disregard crawl delays or bandwidth-saving guidelines, extract full page text, and sometimes attempt to follow dynamic links or scripts.
So they’re just lazily programmed crawlers. Ironically, trying to block them can make web traffic go up, not down, once people switch to more advanced methods to get around the blocking. When you go from a simple wget command ripping the bare page to a full-blown Chrome browser loading all the pictures, JS, and other junk, that shit adds up.
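
For what it's worth, honoring a crawl delay is not much code, which is why skipping it looks like laziness. Here is a rough Python sketch using the standard library's urllib.robotparser. The robots.txt contents, the ExampleBot user agent, and the example.com URLs are all made up for illustration, and the fetch callback is a placeholder for whatever actually downloads the page.

```python
import time
import urllib.robotparser

# Hypothetical robots.txt for illustration; a real crawler would download
# it from the target site instead of hardcoding it.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /search
"""


def plan_fetches(urls, user_agent="ExampleBot", robots_txt=ROBOTS_TXT):
    """Return (allowed_urls, delay_seconds) according to robots.txt."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # crawl_delay() returns None when no Crawl-delay applies to this agent.
    delay = parser.crawl_delay(user_agent) or 0
    allowed = [u for u in urls if parser.can_fetch(user_agent, u)]
    return allowed, delay


def crawl(urls, fetch, user_agent="ExampleBot", sleep=time.sleep):
    """Fetch only the allowed URLs, pausing between consecutive requests."""
    allowed, delay = plan_fetches(urls, user_agent)
    for i, url in enumerate(allowed):
        if i > 0:
            sleep(delay)  # wait before every request except the first
        fetch(url)
```

Calling crawl(urls, fetch=my_download_function) would skip anything under /search and wait two seconds between the rest. Note that Crawl-delay is a non-standard robots.txt extension, so not every site sets it and not every crawler looks for it.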