PIXART-α (https://pixart-alpha.github.io/)
PIXART-α is a low-cost, efficient text-to-image (T2I) model built on a Transformer-based diffusion architecture. It produces images whose quality is competitive with state-of-the-art generators and supports high-resolution synthesis. The authors propose three core designs: a decomposed training strategy, an efficient T2I Transformer that uses cross-attention modules to inject text conditions, and highly informative training data obtained through auto-labeling. Together these significantly reduce training time, which lowers both cost and CO2 emissions. The model demonstrates strong image quality, artistry, and semantic control, making it a useful reference for the AIGC community and for startups that want to build their own affordable, high-quality generative models.
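To make the cross-attention idea concrete, here is a minimal sketch in plain Python of how image tokens can attend to text tokens. It is not the authors' implementation: the function names (`cross_attention`, `block_with_text_condition`) are hypothetical, the learned query/key/value projections are replaced by the identity for brevity, and only a single attention head is shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(image_tokens, text_tokens):
    """Single-head cross-attention: queries come from image tokens,
    keys and values come from text tokens.

    Assumption: identity projections stand in for the learned
    W_q, W_k, W_v matrices a real model would use.
    """
    d = len(text_tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in image_tokens:
        # Scaled dot-product score of this image token against every text token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in text_tokens]
        weights = softmax(scores)
        # Weighted average of the text tokens (used as values).
        mixed = [sum(w * v[j] for w, v in zip(weights, text_tokens)) for j in range(d)]
        out.append(mixed)
    return out


def block_with_text_condition(image_tokens, text_tokens):
    """Residual connection around cross-attention, so text information
    is added to each image token rather than replacing it."""
    attended = cross_attention(image_tokens, text_tokens)
    return [[x + a for x, a in zip(tok, att)]
            for tok, att in zip(image_tokens, attended)]


if __name__ == "__main__":
    image_tokens = [[1.0, 0.0], [0.0, 1.0]]             # two toy image patches
    text_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy caption tokens
    for row in block_with_text_condition(image_tokens, text_tokens):
        print(row)
```

Each output row is the original image token plus a softmax-weighted mix of the text tokens, so the caption influences every patch. A real model would add learned projections, multiple heads, and normalization around this step.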