With sparse attention, very interesting. It seems GQA is a thing of the past.
GLM 4.6 is reportedly about to drop too.
Submitted 13 hours ago by cm0002@lemmy.world to technology@lemmy.zip
https://huggingface.co/collections/deepseek-ai/deepseek-v32-68da2f317324c70047c28f66
BroBot9000@lemmy.world 12 hours ago
New version of the propaganda machine dropped 🤦‍♂️
brucethemoose@lemmy.world 12 hours ago
DeepSeek is only bad via the chat app and whatever prefilter (or finetune?) they censor it with.
The model itself (via API or run locally) isn't too bad. Obviously there are CCP-mandated gaps, but it's not as tankie as you'd think.
cm0002@lemmy.world 11 hours ago
Just ignore them on anything AI-related. They are the polar opposite of the AI tech bros, shitting on anything and everyone using AI in any form for anything.