Comment on Bill proposed to outlaw downloading Chinese AI models.
p03locke@lemmy.dbzer0.com 2 weeks ago
There are several “good” LLMs trained on open datasets like FineWeb, LAION, DataComp, etc.
Then use those as training data. You’re so caught up on this exacting definition of open source that you’ll completely ignore the benefits this model could provide.
an LLM could decide to, for example, summarize and compress some context full of trade secrets, then proceed to “search” for it, sending it to wherever it has access to.
That’s not how LLMs work, and you know it. A set of model weights is not a lossless compression algorithm.
Also, if you’re giving an LLM free rein over all of your session tokens and security passwords, that’s on you.
jarfil@beehaw.org 2 weeks ago
piratewires.com/…/compression-prompts-gpt-hidden-…
Trade secrets cover far more than session tokens and security passwords. People want AI agents to summarize their local knowledge base and documents, then expand the summary with updated web searches. No passwords are needed when the LLM can direct the data to be exfiltrated itself.