Large-scale text-to-image generation models struggle to synthesize high-quality images of novel concepts, and current attempts to teach them new concepts tend to overfit to the given reference images. To address this, DreamArtist uses a positive-negative prompt-tuning strategy that jointly trains a positive and a negative embedding. The positive embedding captures the characteristics of the reference image, while the negative embedding rectifies the inadequacies of the positive one. This method achieves superior generation performance over existing approaches and extends to a broader range of applications, including concept composition and prompt-guided image editing.
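At inference, the two learned embeddings can be combined in a classifier-free-guidance-style step: the denoiser prediction conditioned on the negative embedding is used as the baseline that the positive-conditioned prediction is pushed away from. The sketch below is a simplified illustration of that combination only; the function name, `gamma`, and the toy NumPy arrays are hypothetical stand-ins, not the repo's actual implementation.

```python
import numpy as np

def pn_guidance(eps_pos, eps_neg, gamma=5.0):
    """Combine denoiser noise predictions conditioned on the positive
    and negative embeddings. With gamma > 1 the result is steered
    toward the positive prediction and away from the negative one,
    analogous to classifier-free guidance."""
    return eps_neg + gamma * (eps_pos - eps_neg)

# Toy stand-ins for the two conditioned noise predictions.
rng = np.random.default_rng(0)
eps_pos = rng.standard_normal((4, 4))
eps_neg = rng.standard_normal((4, 4))
eps = pn_guidance(eps_pos, eps_neg, gamma=5.0)
```

Note the boundary behavior: `gamma=1` reduces to the purely positive prediction, while `gamma=0` ignores the positive embedding entirely.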
This repo is the official PyTorch implementation of "DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Contrastive Prompt-Tuning" with Stable-Diffusion-webui.
With just one training image, DreamArtist learns its content and style, generating diverse, high-quality images with high controllability. A learned embedding pair can easily be combined with additional text descriptions, and two learned embedding pairs can be composed together.