This is an automated archive made by the Lemmit Bot.
The original was posted on /r/opensource by /u/abrazilianinreddit on 2026-04-02 05:32:42+00:00.
This post is just to try to start the discussion around the usage of open-source code as training data on computational models, usually against the author’s desires.
I’m sure pessimists won’t care and say that big-tech companies won’t care about the license and use any public repositories as they wish, at least until a precedent is set in court.
Yet many book publishers and newspapers are suing AI companies, and often getting settlements as a result, meaning there’s solid case for violation of copyright in there.
Having a license that explicitly forbids usage of open-source projects by LLMs would definitely make lawyers sweat and companies fearful, much like how they detest GPL licenses - so what better way to do that than updating GPL3 or AGPL to our current situation? As a reminder, both licenses haven’t been changed since 2007.