Comment on Instances that didn't block Facebook's threads.net
dan@upvote.au 10 months ago
You do realise that your account on here is public, right? Anyone can collect data like comments, and it’s likely already being used to train AI models just like Reddit data was.
The only way to avoid data being sent to other servers is by having a private account and never interacting with content from other servers. I don’t think it’s even possible to have a private Lemmy account.
Yoz@lemmy.world 10 months ago
So by your logic, you just want to hand everything to Facebook on a golden platter. Let them scrape the data if they want, but handing it over willingly is like smoking cigarettes only to get cancer.
dan@upvote.au 10 months ago
The data is already handed out willingly. Anyone can write code that federates with a Lemmy instance using the ActivityPub protocol, subscribe, and receive a feed of all posts and comments. The instance you’re on federates with around 5850 servers: lemmy.world/instances. Do you really think the admins have verified every one of them to ensure they’re legit?
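As a rough sketch of how little is involved: subscribing boils down to an ActivityPub `Follow` activity delivered to the community's inbox. The URLs and the `build_follow` helper below are made up for illustration, and a real server would also have to send this as a signed HTTP POST (HTTP Signatures), which is left out here. Once the follow is accepted, the community's server starts delivering new posts and comments to the follower's inbox.

```python
import json

# Standard ActivityStreams 2.0 JSON-LD context.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_follow(actor_url: str, community_url: str, activity_id: str) -> dict:
    """Build an ActivityPub Follow activity (hypothetical helper, not Lemmy's code)."""
    return {
        "@context": AS_CONTEXT,
        "id": activity_id,        # unique URL identifying this activity
        "type": "Follow",
        "actor": actor_url,       # the account that wants to subscribe
        "object": community_url,  # the community (a Group actor) being followed
    }


# All URLs below are placeholders, not real accounts or communities.
follow = build_follow(
    actor_url="https://example.social/u/reader",
    community_url="https://lemmy.example/c/technology",
    activity_id="https://example.social/activities/follow/1",
)

# This JSON body is what would be POSTed to the community's inbox.
print(json.dumps(follow, indent=2))
```

Nothing in that payload says who runs the subscribing server or what they do with the feed afterwards, which is the whole point.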
Yoz@lemmy.world 10 months ago
I understand, but the point I am trying to make is that Facebook is a data harvesting company and already has tools and algorithms in place. They can do real damage compared to a small company or a college student scraping data for a project. Facebook isn’t known for helping people or changing things for the better, so why risk it?
dan@upvote.au 10 months ago
Arguably all big tech companies do some sort of data harvesting though. Google is primarily an advertising and data collection company, and their data collection is more widespread than others - have you seen how many sites have Google Analytics on them, how many people use Android, and how many people use Gmail, Google Drive, etc? Apple allow data collection as long as it’s them doing it (hence trying to block third parties from doing it - giving them an advantage).
If you’re worried about data harvesting, the real companies you need to worry about are companies like Acxiom/Liveramp, Experian, Datalogix, Neustar, etc. These are the companies that create profiles on you based on data they gather from a very large number of different sources (credit card data, supermarket reward programs, frequent flyer programs, mailers / TV ads you respond to, internet ads you click, things you buy online, etc) and sell them to advertisers. The big tech companies don’t do anything like that.
How can you be sure that only small companies or students are scraping Lemmy/Mastodon data today? One of those 5800 servers that federate with your Lemmy instance could be funneling data to a data analysis firm.