Overview
In this work, we present FreeControl, a training-free approach to controllable text-to-image (T2I) generation that supports multiple conditions, architectures, and checkpoints simultaneously. FreeControl operates in two stages. In the analysis stage, it queries a T2I model to generate as few as one seed image and constructs a linear feature subspace from the generated images. In the synthesis stage, it applies guidance within this subspace in two forms: structure guidance, which aligns the generated image with the structure of a guidance image, and appearance guidance, which shares appearance between images generated with and without control from the same seed.
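The analysis stage can be sketched as a PCA over diffusion features: stack the intermediate features of the seed images, find the principal directions, and later project new features onto that basis during synthesis. The function names, shapes, and feature source below are illustrative assumptions, not FreeControl's actual API:

```python
import numpy as np

def build_subspace(features: np.ndarray, n_components: int):
    """Analysis stage (sketch): PCA basis from stacked seed-image features.

    features: (num_tokens, dim) array, e.g. spatial features collected from
    a diffusion model's intermediate layers (hypothetical shape).
    Returns the feature mean and an orthonormal basis of the subspace.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Principal directions = top right singular vectors of the centered matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]            # (n_components, dim)
    return mean, basis

def project(features: np.ndarray, mean: np.ndarray, basis: np.ndarray):
    """Synthesis stage (sketch): coordinates of features in the subspace,
    on which structure/appearance guidance losses could be computed."""
    return (features - mean) @ basis.T   # (num_tokens, n_components)

# Toy example: 512 tokens with 64-dim features, reduced to an 8-dim subspace.
rng = np.random.default_rng(0)
feats = rng.normal(size=(512, 64))
mean, basis = build_subspace(feats, n_components=8)
coords = project(feats, mean, basis)
print(coords.shape)  # (512, 8)
```

In this sketch, guidance would compare `coords` of the current generation against those of the guidance image and steer sampling to reduce the gap; the paper's exact guidance energies are not reproduced here.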
Paper: arxiv.org/abs/2312.07536
Code: github.com/genforce/freecontrol (coming soon)
Project Page: genforce.github.io/freecontrol/
Controllable generation with T2I diffusion models.
Any condition generation: