RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

Submitted 1 year ago by Even_Adder@lemmy.dbzer0.com to stable_diffusion@lemmy.dbzer0.com

https://i.imgur.com/sg66Xhk.png

Abstract

We propose Reference-Based Modulation (RB-Modulation), a new plug-and-play solution for training-free personalization of diffusion models. Existing training-free approaches exhibit difficulties in (a) style extraction from reference images in the absence of additional style or content text descriptions, (b) unwanted content leakage from reference style images, and (c) effective composition of style and content. RB-Modulation is built on a novel stochastic optimal controller where a style descriptor encodes the desired attributes through a terminal cost. The resulting drift not only overcomes the difficulties above, but also ensures high fidelity to the reference style and adheres to the given text prompt. We also introduce a cross-attention-based feature aggregation scheme that allows RB-Modulation to decouple content and style from the reference image. With theoretical justification and empirical evidence, our framework demonstrates precise extraction and control of content and style in a training-free manner. Further, our method allows a seamless composition of content and style, which marks a departure from the dependency on external adapters or ControlNets.
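
For readers unfamiliar with the "terminal cost" idea in the abstract, here is a toy sketch of how a cost on the final state can steer the drift of a stochastic process. This is not the authors' method or code (theirs is not released yet): the 4-dimensional state, the `style_descriptor` (just a mean), the `strength` parameter, and the greedy gradient-based control are all assumptions made up for illustration. A real system would use a learned style feature extractor and a diffusion model's reverse process.

```python
import random

def style_descriptor(x):
    # Hypothetical stand-in for a learned style feature extractor: the mean of the state.
    return sum(x) / len(x)

def terminal_cost(x, ref):
    # Squared distance between the descriptor of the state and that of the reference.
    return (style_descriptor(x) - style_descriptor(ref)) ** 2

def terminal_cost_grad(x, ref):
    # Analytic gradient of terminal_cost with respect to each coordinate of x.
    diff = style_descriptor(x) - style_descriptor(ref)
    return [2.0 * diff / len(x)] * len(x)

def simulate(x0, ref, strength, steps=100, dt=0.05, sigma=0.1, seed=0):
    # Euler-Maruyama for dX = (-X + u) dt + sigma dW.
    # Assumption: the control u is a greedy step, u = -strength * grad(cost),
    # evaluated at the current state rather than obtained by solving the
    # full optimal control problem.
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        grad = terminal_cost_grad(x, ref)
        x = [
            xi + (-xi - strength * gi) * dt + sigma * dt ** 0.5 * rng.gauss(0.0, 1.0)
            for xi, gi in zip(x, grad)
        ]
    return x

x0 = [0.5, -0.5, 1.0, -1.0]
ref = [2.0, 2.0, 2.0, 2.0]  # toy "reference" whose descriptor we want to match

# Same seed, so both runs see the same noise; only the control term differs.
uncontrolled = terminal_cost(simulate(x0, ref, strength=0.0), ref)
controlled = terminal_cost(simulate(x0, ref, strength=8.0), ref)
print(f"terminal cost without control: {uncontrolled:.3f}")
print(f"terminal cost with control:    {controlled:.3f}")
```

With `strength=0.0` the process simply relaxes toward zero and ignores the reference. With a positive `strength` the extra drift term pulls the descriptor of the final state toward the reference, which is the rough intuition behind encoding the desired attributes through a terminal cost.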

Paper: arxiv.org/abs/2405.17401

Code: github.com/LituRout/RB-Modulation (coming soon)

Project Page: rb-modulation.github.io


Comments


tagginator@utter.online [bot] 1 year ago
New Lemmy Post: RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control (https://lemmyverse.link/lemmy.dbzer0.com/post/21603938)
Tagging: #StableDiffusion
(Replying in the OP of this thread (NOT THIS BOT!) will appear as a comment in the lemmy discussion.)
I am a FOSS bot. Check my README: https://github.com/db0/lemmy-tagginator/blob/main/README.md