Comment on Why wordfreq will not be updated - AI spam

tal@lemmy.today 2 months ago

wordfreq is not just concerned with formal printed words. It collected more conversational language usage from two sources in particular: Twitter and Reddit.

Now Twitter is gone anyway, its public APIs have shut down,

Reddit also stopped providing public data archives, and now they sell their archives at a price that only OpenAI will pay.

There’s still the Fediverse.

I mean, that doesn’t solve the LLM pollution problem, but…
