
kby@feddit.de 1 year ago

You can try setting up Ollama on your RPi and then use a highly optimized variant of the Mistral model (or quantize it yourself with llama.cpp, which produces GGUF files). You can go as far as very heavy 2-bit quantization, which will increase the error rate. But if you only plan to use the generated text as a starting point, it might still be useful. Also see: github.com/ollama/ollama/blob/main/…/import.md#im…
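
To get a feel for why the quantization level matters on an RPi, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at different bit widths. The parameter count (about 7.2 billion for a Mistral-7B-class model) is an approximation I'm assuming, and `weight_memory_gib` is just a helper name made up for this example. Real GGUF quantization formats store extra scaling data, so actual files come out somewhat larger than this, and the runtime needs additional memory for context on top.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Ignores per-block scaling metadata and runtime overhead,
    so treat the result as a lower bound.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: roughly 7.2 billion parameters (Mistral-7B-class model).
N_PARAMS = 7.2e9

if __name__ == "__main__":
    for bits in (16, 8, 4, 2):
        size = weight_memory_gib(N_PARAMS, bits)
        print(f"{bits:>2}-bit: ~{size:.1f} GiB for weights")
```

By this estimate a 2-bit variant needs roughly half the memory of a 4-bit one, which is what can make the difference on a board with only a few GB of RAM.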

