My own personal belief is very close to what you’ve said. It’s a technology that isn’t new, but it had been assumed to be worse than compositional models because it would cost a fuck-ton to build and would result in dangerous hallucinations. It turns out that both are still true, but people don’t particularly care. I also believe that one of the reasons ChatGPT has performed so well compared to other LLM initiatives is that there is a huge amount of stolen data behind it that would get OpenAI in a LOT of trouble.
IMO, the real breakthroughs will be in academia. Now that LLMs are popular again, we’ll see more research into how they can be better utilised.
prime_number_314159@lemmy.world 7 months ago
The (really, really, really) big problem with the internet is that so much of it is garbage data. The number of false and misleading claims spread endlessly on the internet is huge. To rule those beliefs out of the data set, you need something that can grasp the nuances of everything from published, peer-reviewed data that is deliberately misleading propaganda, to fringe conspiracy nuts who believe the Earth is controlled by lizards with planes and that only a spritz bottle full of vinegar can defeat them, and everything in between.
There is no person, book, journal, website, newspaper, university, or government that has reliably produced good, consistent help on questions of science, religion, popular lies, unpopular truths, programming, human behavior, economic models, and many, many other things that continuously have an influence on our understanding of the world.
We can’t build an LLM that won’t consistently be wrong until we can stop being consistently wrong.
Donkter@lemmy.world 7 months ago
Yeah, I’ve heard medical LLMs are promising when they’ve been trained exclusively on medical texts. Same with the AI that’s been trained exclusively on DNA, etc.