A local model runs on the device, so there’s no need to connect to a big data center that chugs lots of water, along with all those other problems. Of course, because it’s a far tinier model it’s nowhere near as accurate, but for things like this you don’t really need a big, accurate LLM.
I should also add a disclaimer that I am a software developer, not an AI developer, so there’s far less backing behind my perspective than there would be from someone who works with this stuff for a living.
Cethin@lemmy.zip 2 days ago
I don’t know, but I’m willing to bet that economies of scale actually mean data centers are more efficient. This isn’t to say their use is justified, just that they’re able to take advantage of things a home computer can’t.
However, having to run it locally means it needs to be much more limited. This is doubly true if you want to run the game and the LLM at the same time. The LLM can easily consume all the resources your system has available if you allow it to, which means the game won’t run well (if it runs at all). This limits its use, so it can’t just be shoved everywhere and kept constantly running, like it could be if the work were sent to a data center. It’s not more efficient, it just consumes less.
SabinStargem@lemmy.today 1 day ago
On my system, I can play an RPG Maker game and use a 122b LLM at the same time, alongside a podcast. A model in that parameter range takes up about 70 GB of DDR4 RAM and 36 GB of VRAM. However, a 120b model used to have a larger footprint, bringing the system to the brink. The hardware requirements are going down, and quality has also increased, alongside speed. I believe that when the next major sea change in hardware happens, AI will become very practical for gaming.
Cethin@lemmy.zip 1 day ago
Damn, your system is insane. Yeah, an RPG Maker game is next to nothing compared to that. Still, I think Dragon Quest is 3D. It takes a lot more VRAM than RPG Maker.
I have 16 GB of VRAM, which is a lot for most systems. That’s easily consumed by an LLM. Any model that doesn’t use at least that much tends to perform pretty poorly, in my experience. That’s not to mention how much heat it generates while running, which has to be removed from the system or it’ll slow down. Even if your system can handle it, it heats up fast. It’s great when I need a heater running, but when I need AC my room gets warm quick.
SabinStargem@lemmy.today 1 day ago
Keep in mind, a 122b model (Qwen3.5 family) is high end for consumer machines, and it is likely that DQX would be using a much smaller model. Currently, we have Qwen models that are 8b, 9b, 27b, 35b, 122b, and 397b. Plus, ‘quanting’ (storing the weights at fewer bits each) can reduce how much memory a model takes up, at a tradeoff in quality, o’course. I am guessing DQX would have multiple local models, and use the player’s hardware metrics to decide which model to deploy.
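To make the "pick a model from the player's hardware" idea concrete, here is a rough Python sketch. It is only a back-of-the-envelope estimate, not how DQX or any real runtime does it: the function names, the 4-bit default, and the 1.2x overhead factor are all assumptions for illustration, and the only numbers taken from the thread are the model sizes.

```python
from typing import Optional

# Parameter counts in billions, copied from the sizes listed in the comment.
MODEL_SIZES_B = [8, 9, 27, 35, 122, 397]


def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9.
    """
    return params_billion * bits_per_weight / 8


def pick_model(
    budget_gb: float,
    bits_per_weight: float = 4.0,  # assumed quantization level
    overhead: float = 1.2,         # assumed headroom for context/cache, hypothetical
) -> Optional[int]:
    """Return the largest listed model size that fits the budget, or None."""
    fitting = [
        size
        for size in MODEL_SIZES_B
        if weight_memory_gb(size, bits_per_weight) * overhead <= budget_gb
    ]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    for budget in (2, 16, 106):
        print(budget, "GB budget ->", pick_model(budget))
```

Under these assumptions, a 16 GB budget at 4 bits per weight lands on the 9b model, since the 27b would need roughly 16.2 GB. Real footprints depend on the quant format, context length, and the runtime, so treat the numbers as ballpark only.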
Alternatively, the Chatty Slime could rely on cloud AI. Depending on Square’s strategy, that could be a freebie or a paid service. If the Chatty Slime gave options to the player (say, trading a potion for a stat seed, or responding to a quiz), Square could sell player behavior data.
…Anyhow, my room has a mini-split AC. One of the best purchases in my life: my room lacked insulation in the first place, so it becomes toasty during summer. The side effect is being able to just run my GPU and not become a human slushy.