Comment on Advice - Getting started with LLMs

BaroqueInMind@lemmy.one 7 months ago

Ollama is so fucking slow. Even on a 16-core overclocked Intel CPU with 64 GB of RAM and an Nvidia 3080 with 10 GB of VRAM, running a 22B-parameter model, generating the tokens for a simple haiku takes 20 minutes.
