Comment on ostris/Z-Image-De-Turbo · A De-distilled Z-Image-Turbo

Even_Adder@lemmy.dbzer0.com 16 hours ago

Distillation is basically when you use a larger model (the teacher) to train a smaller one (the student). You train the student on a mix of outputs generated by the teacher and ground-truth data, and by some strange alchemy I don’t quite understand you end up with a much smaller model that behaves a lot like the teacher.
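To make the "teacher outputs plus ground truth" part concrete, here is a minimal sketch of the classic classifier-style distillation loss, in plain Python. This is an assumption about the general technique, not how Z-Image-Turbo itself was distilled (image models usually use different objectives). The names `distillation_loss`, `alpha`, and `temperature` are made up for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      alpha=0.5, temperature=2.0):
    """Blend two signals (hypothetical helper, illustrative only):
    - soft term: KL divergence from the teacher's softened output to the student's
    - hard term: cross-entropy of the student against the ground-truth label
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    hard = -math.log(softmax(student_logits)[true_label])
    return alpha * soft + (1 - alpha) * hard

teacher = [2.0, 1.0, 0.1]
matching_student = [2.0, 1.0, 0.1]   # agrees with the teacher
confused_student = [0.1, 1.0, 2.0]   # ranks the classes the other way round

loss_matching = distillation_loss(matching_student, teacher, true_label=0)
loss_confused = distillation_loss(confused_student, teacher, true_label=0)
```

The student that agrees with the teacher gets the lower loss, so minimizing this during training pulls the student toward the teacher's behavior while the hard term keeps it anchored to the real labels.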
