We need to do something to protect Internet Archive and its access to scrape sites.
Comment on Google Is the Only Search Engine That Works on Reddit Now, Thanks to AI Deal
termus@beehaw.org 3 months ago
Does this mean the Internet Archive will no longer be archiving reddit posts? That’s how I’ve tried viewing most since I deleted my accounts.
intensely_human@lemm.ee 3 months ago
OfficerBribe@lemm.ee 3 months ago
I honestly do not think the Internet Archive should even be archiving behemoths like Reddit or Twitter. The only thing it should keep is currently dead sites.
It's even worse when people access these posts through the Archive while a live copy exists. A lot of storage and bandwidth is wasted.
Onihikage@beehaw.org 3 months ago
Counterpoint: Scumbag companies ninja-edit their timestamped warranty pages, and the only way you can tell they edited one after you bought the product is that it was archived beforehand.
neutronst4r@beehaw.org 3 months ago
But imagine this… an immoral rich human being, whose family got rich by mining blood rubies in South Africa, buys Reddit for $50B. This person fires half the people and refuses to pay the bills for servers, and the servers shut down… how will you access your favorite GoneWild posts? This is all fictional of course.
halm@leminal.space 3 months ago
…but at some point those giant sites may go offline. I see the point of archiving them now for posterity, but you’re right. The archive shouldn’t be used as a concurrent mirror of those sites for privacy reasons.
I have my browser set up to redirect Reddit links to libreddit instances for that purpose.
Kissaki@beehaw.org 3 months ago
How do you keep a currently dead website you did not previously archive?
OfficerBribe@lemm.ee 3 months ago
True, although I think there are usually either signs, or the site admins give a heads-up when a site is about to go under. I doubt Reddit or Twitter will be dead any time soon.