Multi-Concept Customization of Text-to-Image Diffusion
https://www.cs.cmu.edu/~custom-diffusion
The paper proposes Custom Diffusion, an efficient method for augmenting existing text-to-image models so that they quickly acquire a new concept from only a few examples. Rather than fine-tuning the whole model, the method optimizes only a small set of parameters in the text-to-image conditioning mechanism to represent new concepts, which makes tuning fast and keeps the per-concept storage overhead low: apart from the pretrained model, each additional concept requires saving only the updated parameters. The fine-tuned model can generate variations of multiple new concepts, and can compose them together in novel, unseen settings. The paper presents single-concept and multi-concept results, along with qualitative comparisons against concurrent works. The method still struggles with difficult compositions, such as combining pets of different species or composing three or more concepts in one image.
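The core idea, updating only the conditioning pathway, can be illustrated with a minimal sketch. In Custom Diffusion the updated parameters are the key and value projection matrices of the cross-attention layers (the layers that inject text features into the image backbone). The parameter names and sizes below are purely illustrative, not the real Stable Diffusion layout; the sketch just selects the key/value projections from a parameter inventory and reports what fraction of the model would be fine-tuned and stored per concept:

```python
# Hypothetical parameter inventory for a small diffusion U-Net.
# Names and tensor sizes are illustrative, not a real model's layout.
params = {
    "unet.down.0.resnet.conv.weight": 3_000_000,
    "unet.down.0.attn.to_q.weight": 160_000,
    "unet.down.0.attn.to_k.weight": 160_000,  # text -> image conditioning
    "unet.down.0.attn.to_v.weight": 160_000,  # text -> image conditioning
    "unet.mid.attn.to_k.weight": 640_000,
    "unet.mid.attn.to_v.weight": 640_000,
    "unet.up.0.resnet.conv.weight": 3_000_000,
}

def trainable_subset(params):
    """Keep only cross-attention key/value projections for fine-tuning;
    everything else stays frozen at its pretrained values."""
    return {name: size for name, size in params.items()
            if name.endswith(("to_k.weight", "to_v.weight"))}

tuned = trainable_subset(params)
total = sum(params.values())
frac = sum(tuned.values()) / total
print(f"fine-tuning {len(tuned)} of {len(params)} tensors "
      f"({frac:.1%} of parameters)")
# Only `tuned` would need to be optimized and saved per new concept.
```

Because only the selected tensors change, a new concept is stored as a small delta on top of the shared pretrained checkpoint, which is what keeps the per-concept storage cost low.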