Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model
https://github.com/SUDO-AI-3D/zero123plus#zero123-a-single-image-to-consistent-multi-view-diffusion-base-model
Zero123++ is an image-conditioned diffusion model, described in the accompanying paper, that generates 3D-consistent multi-view images from a single input view. It keeps fine-tuning effort low by reusing pre-trained 2D generative priors, in particular those of Stable Diffusion. The main design changes are tiling six views into a single image so they are generated jointly, shifting the noise schedule, using scaled reference attention for local conditioning, and applying the FlexDiffuse guidance mechanism for global conditioning. Together these changes target two common failure modes of earlier single-image multi-view models: texture degradation and geometric misalignment between views. The paper supports this with qualitative and quantitative comparisons against leading models. It also presents a depth-controlled version of Zero123++ built with ControlNet, which shows that the base model can be extended with additional conditioning.
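To make the tiling idea concrete, here is a minimal NumPy sketch of packing six equally sized views into one image and splitting them back out. It is an illustration only, not the authors' code: the 3-row by 2-column layout, the row-major ordering, and the helper names `tile_views` and `split_views` are assumptions made for this example.

```python
import numpy as np


def tile_views(views, rows=3, cols=2):
    """Pack rows*cols views of shape (H, W, C) into one (rows*H, cols*W, C) image.

    The 3x2 row-major layout is an assumption made for this sketch.
    """
    if len(views) != rows * cols:
        raise ValueError(f"expected {rows * cols} views, got {len(views)}")
    stacked = np.stack(views)  # (N, H, W, C); raises if shapes differ
    _, h, w, c = stacked.shape
    # (rows, cols, H, W, C) -> (rows, H, cols, W, C) -> (rows*H, cols*W, C)
    grid = stacked.reshape(rows, cols, h, w, c).transpose(0, 2, 1, 3, 4)
    return grid.reshape(rows * h, cols * w, c)


def split_views(grid, rows=3, cols=2):
    """Inverse of tile_views: recover the list of individual views."""
    total_h, total_w, c = grid.shape
    if total_h % rows or total_w % cols:
        raise ValueError("grid size is not divisible by the layout")
    h, w = total_h // rows, total_w // cols
    # (rows, H, cols, W, C) -> (rows, cols, H, W, C) -> (N, H, W, C)
    tiles = grid.reshape(rows, h, cols, w, c).transpose(0, 2, 1, 3, 4)
    return list(tiles.reshape(rows * cols, h, w, c))


if __name__ == "__main__":
    # Six dummy 4x4 RGB "views", each filled with its own index.
    dummy = [np.full((4, 4, 3), i, dtype=np.uint8) for i in range(6)]
    tiled = tile_views(dummy)
    print(tiled.shape)  # (12, 8, 3)
    recovered = split_views(tiled)
    print(all(np.array_equal(a, b) for a, b in zip(dummy, recovered)))  # True
```

Because all six views sit in a single image, one denoising pass covers every view at once, which is what allows the model to treat them jointly rather than sampling each view independently.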