cross-posted from: programming.dev/post/51407459
Check which models you can run and at what rate in tokens per second… It has examples for many models and quantization levels. Huge resource!
Submitted 1 week ago by anzo@programming.dev to selfhosting@slrpnk.net
green_red_black@slrpnk.net 1 week ago
Doctorbllk@slrpnk.net 1 week ago
I know I will invite ire with this, but I think a self-hosted model is relatively acceptable. Get rid of the generative art and stick to things like code and evaluation, using a model that isn't served from a massive data center (plus the ability to train a model in a way you may find even more acceptable than the default), and most if not all of the questionable aspects of LLMs fade away.