Comment on The Media's Pivot to AI Is Not Real and Not Going to Work

brucethemoose@lemmy.world 2 days ago

SGLang is partly a scripting language for prompt building that leverages its caching and logprobs output for things like filling in fields or branching on choices, so it’s probably best done in that. It also needs pretty beefy hardware for the model size (as opposed to backends like exllama or llama.cpp, which focus more on tight quantization and unbatched performance), so I suppose there’s not a lot of interest from more local tinkerers?
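To make the "branching choices" bit concrete, here is a rough sketch of the idea in plain Python. This is not SGLang's actual API: `toy_score` is a made-up stand-in for whatever logprob output a backend gives you, and `pick_choice` and `route` are hypothetical names I picked for illustration.

```python
import math
from typing import Callable

def pick_choice(score: Callable[[str, str], float], prompt: str, choices: list[str]) -> str:
    """Return the candidate continuation with the highest log-probability."""
    return max(choices, key=lambda c: score(prompt, c))

def toy_score(prompt: str, continuation: str) -> float:
    # Hypothetical stand-in for a backend's logprob output: a continuation
    # that already appears in the prompt is treated as more likely.
    likely = continuation.lower() in prompt.lower()
    return math.log(0.9 if likely else 0.1)

def route(prompt: str) -> str:
    # Branch the rest of the prompt on the chosen category, then leave a
    # field open for the model to fill in.
    choice = pick_choice(toy_score, prompt, ["refund", "shipping"])
    if choice == "refund":
        return prompt + "\nCategory: refund\nRefund amount:"
    return prompt + "\nCategory: shipping\nTracking number:"

print(route("I want a refund for my order"))
```

With a real backend, the scoring step would reuse the cached prompt prefix for every candidate, which is where the caching pays off.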

It would be cool, I guess, but ComfyUI does feel more geared toward diffusion. Image/video generation is more multi-model and benefits from dynamically loading, unloading, and swapping all sorts of little submodels, LoRAs, and masks, applying them, piping them into each other, and such.

Running an LLM is more monolithic: you have the one big model, maybe a text embeddings model as part of the same server, and everything else is just string processing to build the prompts, which one does linearly in Python or whatever. Stuff like CFG and LoRAs does exist, but isn’t used much.
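By "linearly" I mean something like the sketch below: plain string building, one call after another, no node graph needed. The `fake_complete` function is a hypothetical placeholder for whatever backend call you actually make, and the prompt layout is just an assumption for the example, not any model's real chat template.

```python
from typing import Callable

def build_prompt(system: str, history: list[tuple[str, str]], user_msg: str) -> str:
    """Assemble a chat-style prompt by plain string concatenation."""
    parts = [f"System: {system}"]
    for role, text in history:
        parts.append(f"{role}: {text}")
    parts.append(f"User: {user_msg}")
    parts.append("Assistant:")  # left open for the model to continue
    return "\n".join(parts)

def run_pipeline(complete: Callable[[str], str], document: str) -> str:
    # Step 1: ask the model for a summary.
    summary = complete(build_prompt("You summarize text.", [], document))
    # Step 2: feed that output straight into the next prompt.
    return complete(build_prompt("You write headlines.", [], summary))

def fake_complete(prompt: str) -> str:
    # Hypothetical stand-in for a real backend call; reports the prompt size.
    return f"<reply to {len(prompt)} chars>"

print(run_pipeline(fake_complete, "Some long article text..."))
```

Swap `fake_complete` for a real client call and the structure stays the same, which is why a graph UI feels like overkill here.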
