Comment on ostris/Z-Image-De-Turbo · A De-distilled Z-Image-Turbo
Even_Adder@lemmy.dbzer0.com 16 hours ago
It’s basically when you use a larger model to train a smaller one. You train the student model on a mix of outputs generated by the teacher model and ground truth data, and by some strange alchemy I don’t quite understand you get a much smaller model that behaves a lot like the teacher model.
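For anyone curious what the teacher/student idea looks like in practice, here is a minimal sketch of the classic classifier-style distillation loss. This is an illustration only: it is not necessarily how Z-Image-Turbo was distilled (image models typically use different objectives), and the function names, `temperature`, and `alpha` are assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Blend two terms (hypothetical helper, not from any library):

    - soft term: KL divergence from the teacher's softened output to the student's
    - hard term: ordinary cross-entropy against the ground truth label
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * (math.log(pt) - math.log(ps))
               for pt, ps in zip(p_teacher, p_student))
    hard = -math.log(softmax(student_logits)[true_label])
    # The temperature**2 factor keeps the soft term's gradient scale comparable
    # to the hard term's when the temperature is changed.
    return alpha * (temperature ** 2) * soft + (1 - alpha) * hard


teacher = [4.0, 1.0, 0.5]          # teacher is confident in class 0
close_student = [3.5, 1.2, 0.4]    # student that roughly agrees
far_student = [0.2, 3.0, 1.0]      # student that disagrees

print(distillation_loss(close_student, teacher, true_label=0))
print(distillation_loss(far_student, teacher, true_label=0))
```

The student that agrees with the teacher gets the lower loss, so minimizing this pushes the small model toward the big model's outputs while still checking it against the real labels.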
scrubbles@poptalk.scrubbles.tech 13 hours ago
Thanks for explaining!