The Gsgen project introduces Gaussian Splatting based text-to-3D GENeration (Gsgen), an approach for generating high-quality, multi-view consistent 3D assets. Previous methods lacked accurate geometry and fidelity, so Gsgen uses 3D Gaussian Splatting as its representation to address these limitations. The approach follows a progressive optimization strategy with two stages: geometry optimization and appearance refinement. Geometry optimization establishes a coarse representation under a 3D geometry prior, and appearance refinement then iteratively refines the resulting Gaussians to enhance detail. The method is reported to generate 3D assets with accurate geometry and fine detail.
Magic123 is a two-stage solution for generating high-quality 3D meshes from single images. The first stage uses both 2D and 3D priors to optimize a neural radiance field, producing a coarse geometry. The second stage uses a memory-efficient mesh representation to produce a high-resolution mesh with appealing texture. Novel views are generated through reference view supervision combined with diffusion priors. A tradeoff parameter controls the balance between exploration and precision in the generated geometry.
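The tradeoff described above can be pictured as a weighted sum of loss terms. The following is only a minimal sketch under stated assumptions: the function name `combined_loss`, the parameters `lambda_2d` and `lambda_3d`, and the idea that the 2D prior drives exploration while the 3D prior drives precision are illustrative and are not taken from the Magic123 code.

```python
def combined_loss(reference_loss, loss_2d, loss_3d, lambda_2d=1.0, lambda_3d=1.0):
    """Blend reference-view supervision with two diffusion-prior losses.

    Assumption: the 2D-prior term encourages exploration and the 3D-prior
    term encourages precise geometry, so the ratio of the two weights acts
    as the tradeoff. All names and defaults here are hypothetical.
    """
    if lambda_2d < 0 or lambda_3d < 0:
        raise ValueError("prior weights must be non-negative")
    return reference_loss + lambda_2d * loss_2d + lambda_3d * loss_3d
```

Lowering `lambda_3d` relative to `lambda_2d` in this sketch shifts the objective toward the 2D prior, and raising it shifts the objective toward the 3D prior.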
ControlNet is a neural network structure that controls diffusion models by adding extra conditions. It copies the weights of neural network blocks into a "locked" copy and a "trainable" copy: the trainable copy learns the condition, while the locked copy preserves the production-ready diffusion model. The two are connected through "zero convolutions", which output zeros before training, so the added branch causes no distortion at the start. No layer is trained from scratch, so training remains a form of fine-tuning and is feasible on small-scale or even personal devices. The text also explains how ControlNet can be used with Stable Diffusion, reusing the SD encoder as a powerful backbone for learning diverse controls. Several lines of evidence are offered to validate the SD encoder as a backbone.
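The locked/trainable arrangement can be illustrated with a toy example. This is a minimal sketch in plain Python, assuming small dense layers in place of real convolutional blocks; the class names `Linear` and `ControlledBlock` and the helper `zero_layer` are hypothetical and do not come from the ControlNet code base.

```python
import copy


class Linear:
    """A tiny dense layer standing in for a network block (illustrative only)."""

    def __init__(self, weights, bias):
        self.weights = weights  # list of rows, one row per output unit
        self.bias = bias

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weights, self.bias)]


def zero_layer(dim):
    """Stand-in for a zero convolution: weights and bias both start at zero."""
    return Linear([[0.0] * dim for _ in range(dim)], [0.0] * dim)


class ControlledBlock:
    """Locked block plus a trainable copy joined through zero-initialized layers."""

    def __init__(self, pretrained):
        self.locked = pretrained                     # kept frozen
        self.trainable = copy.deepcopy(pretrained)   # would learn the condition
        dim = len(pretrained.bias)
        self.zero_in = zero_layer(dim)
        self.zero_out = zero_layer(dim)

    def __call__(self, x, condition):
        # The condition enters through a zero layer, so it is ignored at first.
        injected = [xi + ci for xi, ci in zip(x, self.zero_in(condition))]
        # The trainable branch leaves through a second zero layer.
        control = self.zero_out(self.trainable(injected))
        return [yi + ki for yi, ki in zip(self.locked(x), control)]
```

Before any training step, `ControlledBlock(pretrained)(x, condition)` returns the same values as `pretrained(x)` for any finite condition, which is the "no distortion" property described above. Only once the zero layers receive non-zero weights does the condition start to influence the output.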
Tips for using ControlNet include adding negative prompts and ignoring canvas height/width. The add-on also supports T2I-Adapter and offers experimental features such as CFG-based ControlNet, Guess Mode, and Multi-ControlNet/Joint Conditioning. The weight and the guidance strength/start/end settings determine how strongly ControlNet influences the original SD UNet.
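One way to picture how weight and guidance start/end interact is as a per-step scale on the control signal. The sketch below is an assumption about the general idea, not the add-on's actual implementation: the function names `control_scale` and `apply_control`, and the choice to measure progress as `step / total_steps`, are hypothetical.

```python
def control_scale(step, total_steps, weight=1.0, start=0.0, end=1.0):
    """Return the scale applied to the control signal at a sampling step.

    Assumption: start and end are fractions of the sampling run, and the
    control is fully off outside that window.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0.0 <= start <= end <= 1.0:
        raise ValueError("expected 0 <= start <= end <= 1")
    progress = step / total_steps
    return weight if start <= progress <= end else 0.0


def apply_control(unet_features, control_residual, scale):
    """Add the scaled control residual to the UNet features."""
    return [f + scale * r for f, r in zip(unet_features, control_residual)]
```

With a window of 0.25 to 0.75 over 20 steps, this sketch leaves the UNet features untouched at step 0 and applies the full weight at step 10.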