SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Submitted 10 months ago by Even_Adder@lemmy.dbzer0.com to stable_diffusion@lemmy.dbzer0.com

https://cdn.prod.website-files.com/64f4e81394e25710d22d042e/672d1bcef3c3ec127e807901_672d1bab3c8e9f27c1a38e98_speed_demo.gif

TL;DR

A new post-training quantization paradigm for diffusion models that quantizes both the weights and activations of FLUX.1 to 4 bits, achieving a 3.5× memory reduction and an 8.7× latency reduction on a 16GB laptop 4090 GPU.
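For anyone wondering what "absorbing outliers by low-rank components" means in practice, here is a rough toy sketch of the general idea, not the authors' implementation. It assumes a small synthetic weight matrix with two outlier columns, a rank-1 branch found by power iteration (standing in for a full SVD), and a simple symmetric per-tensor 4-bit quantizer. All function names and sizes here are made up for illustration; the real method also handles activations and uses fused GPU kernels, none of which is shown.

```python
import random

def quantize_4bit(matrix):
    """Symmetric per-tensor fake quantization to integer levels in [-7, 7]."""
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    return [[max(-7, min(7, round(x / scale))) * scale for x in row]
            for row in matrix]

def rank1_approx(matrix, iters=50):
    """Leading rank-1 term of the matrix, found by power iteration on W^T W."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols
    for _ in range(iters):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    # u = W v carries the singular value, so u v^T is the rank-1 approximation.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]

def frobenius_error(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return sum((x - y) ** 2
               for ra, rb in zip(a, b) for x, y in zip(ra, rb)) ** 0.5

# Toy weights (assumed data): small Gaussian noise plus two large outlier columns.
random.seed(0)
n = 16
a = [random.uniform(0.5, 1.5) for _ in range(n)]
b = [0.0] * n
b[3], b[10] = 5.0, -4.0
W = [[random.gauss(0, 0.1) + a[i] * b[j] for j in range(n)] for i in range(n)]

# Baseline: quantize W directly; the outliers stretch the scale and crush the rest.
err_direct = frobenius_error(W, quantize_4bit(W))

# Low-rank route: keep L in full precision and quantize only the residual R = W - L.
L = rank1_approx(W)
R = [[W[i][j] - L[i][j] for j in range(n)] for i in range(n)]
R_q = quantize_4bit(R)
W_hat = [[L[i][j] + R_q[i][j] for j in range(n)] for i in range(n)]
err_lowrank = frobenius_error(W, W_hat)

print("direct 4-bit error:      ", err_direct)
print("low-rank + 4-bit error:  ", err_lowrank)
```

The point of the toy: once the low-rank branch soaks up the outlier columns, the residual has a much narrower range, so the same 16 quantization levels cover it far more finely.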

Paper: arxiv.org/abs/2411.05007

Weights: huggingface.co/mit-han-lab/svdquant-models

Code: github.com/mit-han-lab/nunchaku

Blog: hanlab.mit.edu/blog/svdquant

Project Page:

Demo: svdquant.mit.edu
