Comment

I think it’s inevitable.
I also think it won’t be possible on our current kind of computing hardware.

The software of the human mind seems to be a byproduct of the structure of the human brain, so I think a major revolution in processor design and manufacturing tech will be needed. It’ll need some fundamentally new form, closer to an ASIC or FPGA than a general-purpose processor, to run with any kind of reasonable efficiency. But it’ll have to be truly three-dimensional, not just layers of 2D processors. It’ll also need to be extremely low power.

LLMs are as close as we have right now, and they have miles to go. They also need hundreds of times more power than the brain does. No, it won’t be soon, and it won’t be with this kind of silicon processor.



e0qdk@reddthat.com 1 month ago
I agree that the hardware being used right now is not well suited. I don’t agree that it’s strictly necessary to use the right hardware – there’s just less tedious waiting involved for the computation to happen if you’ve got better hardware. Real-time interaction is the boundary where you need to have good enough hardware. For everything else you just have to be patient enough – sometimes absurdly so, but you could, in principle, still perform the computation.

> LLMs are as close as we have right now, and they have miles to go. But they need hundreds of times more power than the brain does. No it won’t be soon and it won’t be with this kind of silicon processors.

There are people already baking LLMs into custom hardware – e.g. chatjimmy.ai

Their demo page isn’t the best LLM I’ve seen (Qwen and Gemma are much more clever and more likely to give decent results) but this is a taste of what’s possible… It gives responses at ~17000 tokens a second today.

If I could get answers back from the best Qwen model I’ve got at that speed, I could just retry every query three times, feed the results through another pass to self-assess them, and then reply before you can blink. That would get rid of a lot of the “confidently claims knowledge about a made-up subject” issue we currently see. We can do the same thing on CPUs/GPUs, but you’re stuck waiting so long for the result that most people don’t bother.
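To make the retry-and-self-assess idea concrete, here is a rough Python sketch. Everything in it is a stand-in I made up for illustration: `generate` is whatever function calls your model, `majority_assess` is a crude placeholder for a real self-assessment pass (which would itself be another model call), and the 200-token reply length is a guess, not a measurement.

```python
import itertools
from collections import Counter
from typing import Callable, List


def answer_with_self_check(
    query: str,
    generate: Callable[[str], str],
    assess: Callable[[str, List[str]], str],
    attempts: int = 3,
) -> str:
    """Ask the model `attempts` times, then let `assess` pick the final reply."""
    candidates = [generate(query) for _ in range(attempts)]
    return assess(query, candidates)


def majority_assess(query: str, candidates: List[str]) -> str:
    """Hypothetical assessor: keep an answer only if at least two attempts agree."""
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer if votes >= 2 else "I'm not sure."


def make_fake_model(replies: List[str]) -> Callable[[str], str]:
    """Stand-in for a real model call: cycles through canned replies."""
    cycle = itertools.cycle(replies)
    return lambda query: next(cycle)


def estimate_seconds(tokens_per_reply: int, passes: int, tokens_per_second: float) -> float:
    """Generation time if all passes run one after another (prompt reading ignored)."""
    return tokens_per_reply * passes / tokens_per_second


if __name__ == "__main__":
    consistent = make_fake_model(["Paris", "Paris", "Lyon"])
    print(answer_with_self_check("Capital of France?", consistent, majority_assess))

    inconsistent = make_fake_model(["A", "B", "C"])
    print(answer_with_self_check("Made-up subject?", inconsistent, majority_assess))

    # Three attempts plus one assessment pass, assuming 200 tokens per reply.
    print(estimate_seconds(200, 4, 17000))
```

With those assumed numbers, four sequential passes at ~17000 tokens a second come to roughly 0.05 seconds of generation, which is where the "before you can blink" claim comes from. The same loop works on a CPU or GPU; only the waiting changes.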
