Comment on [deleted]

kby@feddit.de 4 months ago

You could try setting up Ollama on your RPi and then running a highly optimized (quantized) variant of the Mistral model, or quantize it yourself to GGUF with llama.cpp. You can go as far as very heavy quantization (2-bit), which will increase the error rate. But if you only plan to use the generated text as a starting point, it might still be useful. Also see: github.com/ollama/ollama/blob/main/…/import.md#im…
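To get a feel for why the bit width matters so much on an RPi, here is a rough back-of-the-envelope sketch in Python. It only estimates the size of the weights themselves. The parameter count (`N_PARAMS`, about 7.2 billion for Mistral 7B) is my approximation, and the helper name `estimate_weight_gib` is made up for illustration. Real GGUF files come out somewhat larger, since quantization formats store extra scale data and some tensors are kept at higher precision, and you also need RAM for the context on top of this.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Idealized size of the model weights alone, in GiB.

    Ignores quantization metadata, higher-precision tensors, and the
    runtime memory needed for the context, so treat it as a lower bound.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: rough parameter count for Mistral 7B.
N_PARAMS = 7.2e9

if __name__ == "__main__":
    for bits in (16, 8, 4, 2):
        size = estimate_weight_gib(N_PARAMS, bits)
        print(f"{bits:>2}-bit: ~{size:.1f} GiB")
```

Under these assumptions the 2-bit case lands somewhere around 1.7 GiB for the weights, which is why it can fit on boards with 4 GB of RAM, while 16-bit clearly does not.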
