Comment on AI Training Slop
utopiah@lemmy.world 5 days agoI specifically asked about the training part, not the fine-tuning, but thanks for clarifying.
jfrnz@lemm.ee 5 days ago
The point is that OP (most probably) didn’t train it — they downloaded a pre-trained model and only did fine-tuning and inference.
utopiah@lemmy.world 5 days ago
Right, that is exactly my point: OP, having just downloaded the model, might not realize the training costs. They may be low, but on average they are quite high, at least relative to fine-tuning or inference. So my question was precisely to highlight that running a model locally without knowing its training cost is naive, ecologically speaking. They did clarify that they do not care, so that is coherent for them. I'm insisting on this point because others might think "Oh… I can run a model locally, then it's not <<evil>>". So I'm trying to clarify (and please let me know if I'm wrong) that local use is good for privacy, but the upfront training costs are not insignificant, and that might lead some people to prefer models that are not very costly to train, or even a totally different solution.
jfrnz@lemm.ee 5 days ago
The model exists already; abstaining from using it doesn’t make the energy consumption go away. I don’t think it’s reasonable to let historical energy costs drive what you do, otherwise you would never touch a computer.
utopiah@lemmy.world 5 days ago
Indeed, the argument is mostly about future usage and future models. The overall point is that assuming training costs are negligible is either naive or shows that one does not care much about the environment.
From a business perspective, if I’m Microsoft or OpenAI and I see a trend of users prioritizing models that minimize training costs, or even avoiding costly-to-train models, I will adapt to it. On the other hand, if I see that nobody cares, or even that building more data centers drives the value up, I will build bigger models regardless of usage or energy cost.
The point is that training is expensive, and pointing only to inference is like the Titanic steaming full speed toward the iceberg while remarking how small it looks. It is not small.
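The training-vs-inference comparison in this thread can be made concrete with a back-of-envelope sketch. All figures below are placeholder assumptions for illustration, not real measurements of any particular model:

```python
# Back-of-envelope: amortized training energy per query.
# Both constants are illustrative assumptions, not measured values.

TRAINING_ENERGY_KWH = 1_000_000   # assumed one-off training cost (~1 GWh)
INFERENCE_ENERGY_KWH = 0.003      # assumed energy per query (~3 Wh)

def amortized_per_query(total_queries: int) -> float:
    """Energy per query once training is spread over total_queries."""
    return TRAINING_ENERGY_KWH / total_queries + INFERENCE_ENERGY_KWH

# With few total queries the training cost dominates each query;
# only at massive scale does it fade relative to inference.
for n in (1_000_000, 100_000_000, 10_000_000_000):
    print(f"{n:>14,} queries -> {amortized_per_query(n):.4f} kWh/query")
```

Under these assumed numbers, a model answering a million queries carries roughly 1 kWh of training energy per query, while one answering ten billion carries almost none, which is why per-query inference figures alone understate the cost of rarely-used or frequently-retrained models.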