Consistency LLM: converting LLMs to parallel decoders accelerates inference 3.5x
Submitted 1 year ago by bot@lemmy.smeargle.fans [bot] to hackernews@lemmy.smeargle.fans