3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation

Authors: Hansheng Chen ; Bokui Shen ; Yulin Liu ; Ruoxi Shi ; Linqi Zhou ; Connor Z. Lin ; Jiayuan Gu ; Hao Su ; Gordon Wetzstein ; Leonidas Guibas

Multi-view image diffusion models have significantly advanced open-domain 3D object generation. However, most existing models rely on 2D network architectures that lack inherent 3D biases, resulting in compromised geometric consistency. To address this challenge, we introduce 3D-Adapter, a plug-in module designed to infuse 3D geometry awareness into pretrained image diffusion models. Central to our approach is the idea of 3D feedback augmentation: for each denoising step in the sampling loop, 3D-Adapter decodes intermediate multi-view features into a coherent 3D representation, then re-encodes the rendered RGBD views to augment the pretrained base model through feature addition. We study two variants of 3D-Adapter: a fast feed-forward version based on Gaussian splatting and a versatile training-free version utilizing neural fields and meshes. Our extensive experiments demonstrate that 3D-Adapter not only greatly enhances the geometry quality of text-to-multi-view models such as Instant3D and Zero123++, but also enables high-quality 3D generation using the plain text-to-image Stable Diffusion. Furthermore, we showcase the broad application potential of 3D-Adapter by presenting high-quality results in text-to-3D, image-to-3D, text-to-texture, and text-to-avatar tasks.
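The 3D feedback augmentation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every function below (`base_model`, `reconstruct_3d`, `render_rgbd`, `adapter_encode`, `sample`, `cross_view_spread`) is a hypothetical NumPy stand-in, the "3D representation" is just a per-pixel mean across views, and the two-pass ordering inside each step is an assumption made for illustration. It only shows the control flow: decode intermediate views, fuse them into one shared representation, render RGBD, re-encode, and add the result back as features.

```python
import numpy as np


def base_model(views, t, feedback=None):
    """Hypothetical stand-in for a pretrained multi-view denoiser."""
    out = views * (1.0 - 0.1 * t)  # toy per-view update, no cross-view coupling
    if feedback is not None:
        out = out + feedback  # augmentation through feature addition
    return out


def reconstruct_3d(views):
    """Toy '3D representation': one shared estimate fused from all views."""
    return views.mean(axis=0)


def render_rgbd(rep, num_views):
    """Toy renderer: the same RGB for each view plus a constant depth channel."""
    rgb = np.repeat(rep[None], num_views, axis=0)  # (V, H, W, 3)
    depth = np.ones(rgb.shape[:-1] + (1,))  # (V, H, W, 1)
    return np.concatenate([rgb, depth], axis=-1)  # (V, H, W, 4)


def adapter_encode(rgbd, views, scale=0.5):
    """Toy encoder: features that pull each view toward the consistent render."""
    return scale * (rgbd[..., :3] - views)


def sample(num_views=4, size=8, steps=10, use_adapter=True, seed=0):
    rng = np.random.default_rng(seed)
    views = rng.normal(size=(num_views, size, size, 3))
    for i in range(steps):
        t = 1.0 - (i + 1) / steps
        x0 = base_model(views, t)  # intermediate multi-view estimate
        if use_adapter:
            rep = reconstruct_3d(x0)  # decode into a shared representation
            rgbd = render_rgbd(rep, num_views)  # render RGBD views
            feedback = adapter_encode(rgbd, x0)  # re-encode the renders
            x0 = base_model(views, t, feedback)  # augmented second pass
        views = x0
    return views


def cross_view_spread(views):
    """Mean standard deviation across views: a crude inconsistency measure."""
    return float(views.std(axis=0).mean())


with_adapter = cross_view_spread(sample(use_adapter=True))
without_adapter = cross_view_spread(sample(use_adapter=False))
print(with_adapter < without_adapter)
```

In this toy setting the feedback term halves each view's deviation from the shared estimate at every step, so the views agree far more closely with the adapter enabled. A real system would replace the mean with an actual 3D reconstruction (the abstract mentions Gaussian splatting, neural fields, and meshes) and the stand-in encoder with a learned network.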

Subjects: Computer Vision and Pattern Recognition ; Artificial Intelligence

Publish: 2024-10-24 17:59:30 UTC

arXiv: 2410.18974
