Versatile Diffusion
Demo: https://shi-labs-versatile-diffusion.hf.space/?__theme=light
Versatile Diffusion (VD) is the first unified multi-flow multimodal diffusion framework, a step towards Universal Generative AI. VD natively supports text-to-image, image-to-text, image-variation, and text-variation, and can be further extended to other applications such as semantic-style disentanglement, image-text dual-guided generation, latent image-to-text-to-image editing, and more. Future versions will support additional modalities such as speech, music, video, and 3D.
Authors: Xingqian Xu, Atlas Wang, Eric Zhang, Kai Wang, and Humphrey Shi. [arXiv] [GitHub]