How exactly does a website stop a web scraper from one specific org?
I mean isn’t that the whole point of web scraping? That if it’s publicly available, anybody, including people like ICE, will find a way to get the data?
Submitted 4 days ago by sabreW4K3@lazysoci.al to technology@beehaw.org
https://www.404media.co/mozilla-foundation-calls-on-tech-industry-to-block-ice-contractor/
Yeah, it’s not technically impossible to stop web scrapers, but it’s difficult to find a lasting, effective solution. One easy way is to block their user-agent, assuming the scraper uses an identifiable one, but that’s easily circumvented by sending a different user-agent string. Another easy and somewhat more effective way is to block the IP addresses of scrapers and caching services, but that turns into a game of whack-a-mole. You could also put content behind a paywall or login and refuse to approve a certain org, but that will only work for certain use cases, and it’s also easy to circumvent. If stopping a single org’s scraping is the hill to die on, good luck.
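For anyone curious what the first two approaches look like in practice, here’s a minimal sketch in Python. Everything named here is an assumption for illustration: the `is_blocked` helper, the `examplescraperbot` user-agent string, and the `203.0.113.0/24` range (a reserved documentation range, not any real org’s addresses). A real site would more likely do this at the web server or CDN level.

```python
import ipaddress

# Hypothetical blocklists: placeholder values, not any real org's details.
BLOCKED_AGENT_SUBSTRINGS = ["examplescraperbot"]
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # reserved documentation range


def is_blocked(user_agent: str, remote_ip: str) -> bool:
    """Return True if the request matches a blocked user-agent or IP range."""
    # Check 1: case-insensitive substring match on the User-Agent header.
    ua = user_agent.lower()
    if any(fragment in ua for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return True
    # Check 2: is the client address inside any blocked network?
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in network for network in BLOCKED_NETWORKS)


# A scraper that announces itself is caught by the user-agent check.
print(is_blocked("ExampleScraperBot/1.0", "198.51.100.7"))
# A browser-like user-agent from a blocked range is caught by the IP check.
print(is_blocked("Mozilla/5.0", "203.0.113.42"))
# The same scraper with a spoofed user-agent and a fresh IP gets through.
print(is_blocked("Mozilla/5.0", "198.51.100.7"))
```

The last call is the whole problem: once the scraper changes its user-agent and moves to an address you haven’t listed, neither check fires, and you’re back to updating the lists by hand.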
That said, I’m all for fighting ICE, even if it’s futile. Just slowing them down and frustrating them is useful.
HappyFrog@lemmy.blahaj.zone 4 days ago
Fighting ICE automatically gives you 20 morality points.
Luffy879@lemmy.ml 4 days ago
Even a broken clock is right twice a day
moomoomoo309@programming.dev 3 days ago
Something that annoys me about people who love to harp on about how bad Mozilla is because they’ve gone downhill (which they have): Who is better? Genuinely compare them to their competition. Google? Heck no. Brave? Nope. Microsoft? Absolutely not. Apple? No. People complain about how much Mozilla spends on advocacy, but then when they actually do the advocacy, they’re happy about it! They’re perpetually stuck between a rock and a hard place because they’re pulled in both directions and thus, Firefox suffers. But, are they actually a broken clock? Really?