Supercomputers once required large power plants to operate, and now we carry around computing devices in our pockets that are more powerful than those supercomputers.
There’s plenty of room to further shrink the computers, simplify the training sets, formalize and optimize the training algorithms, and add optimized layers to the AI compute systems and the I/O systems.
But at the end of the day, you can either simplify or throw lots of energy at a system when training.
Just look at how much time and energy goes into training a child… and it’s using a training system that’s been optimized over hundreds of thousands of years (and is still being tweaked).
AI as we see it today (as far as generative AI goes) is much simpler: it just sets up and executes probability sieves, with a fancy instruction parser to feed it its inputs. But it is using hardware that’s barely optimized at all for the task, and the task itself is far from the most optimal way to process data to determine an output.
Nibodhika@lemmy.world 10 hours ago
Your answer is intuitively correct, but unfortunately it has a couple of flaws.
They didn’t, not that much anyway. A Cray-1 used 115 kW to produce 160 MFLOPS. And while 115 kW is a LOT, it’s not in the “needs its own power plant to operate” category, since even a small coal power plant (the least efficient electricity generation method) would produce a couple of orders of magnitude more than that.
Indeed, our phones are in the Teraflops range for just a couple of watts.
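To put rough numbers on that gap, here is a quick back-of-the-envelope sketch in Python. The Cray-1 figures are the ones quoted above; the phone figures (1 TFLOPS at 2 W) are only illustrative assumptions standing in for "teraflops range" and "a couple of watts", not measurements of any particular device.

```python
# Back-of-the-envelope efficiency comparison.
# Cray-1 figures are from the comment above; the phone figures are
# illustrative assumptions, not measurements of a real device.
CRAY1_FLOPS = 160e6   # 160 MFLOPS
CRAY1_WATTS = 115e3   # 115 kW
PHONE_FLOPS = 1e12    # assumed: 1 TFLOPS ("teraflops range")
PHONE_WATTS = 2.0     # assumed: "a couple of watts"


def flops_per_watt(flops: float, watts: float) -> float:
    """Return floating-point operations per second per watt."""
    return flops / watts


cray1_eff = flops_per_watt(CRAY1_FLOPS, CRAY1_WATTS)
phone_eff = flops_per_watt(PHONE_FLOPS, PHONE_WATTS)
improvement = phone_eff / cray1_eff

print(f"Cray-1: {cray1_eff:,.0f} FLOPS/W")
print(f"Phone:  {phone_eff:,.0f} FLOPS/W")
print(f"Improvement: about {improvement:.1e}x")
```

Under those assumptions the per-watt improvement comes out at several hundred million times, so the exact phone numbers you plug in barely change the conclusion.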
Unfortunately there isn’t. We’ve reached the end of Moore’s law: processors can’t get any smaller because transistors have to block electrons from passing under given conditions, and if we built transistors smaller than the current ones, electrons would be able to quantum tunnel across them, making them useless.
There might be a revolution in computing by using light instead of electricity (which would completely and utterly revolutionize computers as we know them), but until that happens computers are as small as they’re going to get, or more specifically they’re as space efficient as they’re going to get, i.e. to have more processing power you will need more space.