yellowbadbeast
@yellowbadbeast@lemmy.blahaj.zone
- Comment on What books have a lot of useful information should I get? (I mean like a Wikipedia thing with vast knowledge, but non-electronic.) 15 hours ago:
I think that, while yes, LLMs are an option for data storage, they’re not worth the effort. Sure, they might have a very wide breadth of information that would be hard to gather manually, but how can you be sure that the information you’re getting is a faithful replica of the source, or that the source it was trained on was any good in the first place? A piece of information could come from either 4chan or Wikipedia, and unless you had the sources yourself to confirm (in which case, why use the LLM at all), you’d have no way of telling which it came from.
Aside from that, just getting the information out of one would be a challenge, at least on the hardware of today and the near future. Running a model large enough to hold a useful amount of world knowledge requires some pretty substantial hardware if you want usable speed, and with rising hardware costs, that might not be possible for most people even years from now.
So sure, maybe as an afterthought if you happen to have some extra space on your drives and oodles of spare RAM, but I doubt that it’d be worth thinking that much about.
- Comment on I wish 2 months ago:
Daria
- Comment on Seems legit 2 months ago:
It’s not the LLM that does the web searching, but the software stack around it. On its own, an LLM is just a text completer. What you’d need is a frontend like OpenWebUI or Perplexica that would ask the LLM for, say, five internet search queries that could return useful information for the prompt, throw those queries into SearxNG, and then pipe the results into the LLM’s context for it to use.
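In rough Python, that loop looks something like this. It’s only a minimal sketch: `call_llm` and `searxng_search` are hypothetical stand-ins for the real inference call and SearxNG’s search API, which a frontend like OpenWebUI would handle for you.

```python
# Sketch of the search-augmentation loop: LLM proposes queries,
# a search engine runs them, results land back in the LLM's context.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: a real frontend would POST to an
    # inference server (llama.cpp, Ollama, etc.) here.
    if "search queries" in prompt:
        return "local LLM hardware requirements\nSearxNG setup guide"
    return "Answer grounded in the supplied search results."

def searxng_search(query: str) -> list[str]:
    # Hypothetical stand-in: a real frontend would hit a SearxNG
    # instance and collect result snippets for the query.
    return [f"snippet for: {query}"]

def answer_with_search(user_prompt: str, n_queries: int = 5) -> str:
    # 1. Ask the LLM to propose search queries for the prompt.
    queries = call_llm(
        f"Write up to {n_queries} search queries for: {user_prompt}"
    ).splitlines()[:n_queries]

    # 2. Run each query through the search engine, collect snippets.
    snippets = [s for q in queries for s in searxng_search(q)]

    # 3. Pipe the results into the LLM's context and answer.
    context = "\n".join(snippets)
    return call_llm(f"Search results:\n{context}\n\nQuestion: {user_prompt}")
```

The point is just that the "searching" is plain orchestration code around the model; the LLM itself only ever completes text it's handed.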
As for the models themselves, any decently sized one that was released fairly recently would work. If you’re looking specifically for open-source rather than open-weight models (meaning that the training data and methodologies were also released rather than just the model weights), GPT-OSS 20B/120B and the OLMo models are recent standouts there. If not, the Qwen3 series is pretty good.
- Comment on Seems legit 2 months ago:
Qwen3-0.6B is about 400 MB at Q4 and is surprisingly coherent for what it is.
- Comment on Anon doesn't understand streamer fans 4 months ago:
The appeal isn’t in the games themselves, it’s in the personality playing them.