ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation

Authors: Yuan Zhou, Shilong Jin, Litao Hua, Wanjun Lv, Haoran Duan, Jungong Han

Recent advances in zero-shot text-to-3D generation have revolutionized 3D content creation by enabling direct synthesis from textual descriptions. While state-of-the-art methods leverage 3D Gaussian Splatting with score distillation to enhance multi-view rendering through pre-trained text-to-image (T2I) models, they suffer from inherent view biases in T2I priors. These biases lead to inconsistent 3D generation, particularly manifesting as the multi-face Janus problem, where objects exhibit conflicting features across views. To address this fundamental challenge, we propose ConsDreamer, a novel framework that mitigates view bias by refining both the conditional and unconditional terms in the score distillation process: (1) a View Disentanglement Module (VDM) that eliminates viewpoint biases in conditional prompts by decoupling irrelevant view components and injecting precise camera parameters; and (2) a similarity-based partial order loss that enforces geometric consistency in the unconditional term by aligning cosine similarities with azimuth relationships. Extensive experiments demonstrate that ConsDreamer effectively mitigates the multi-face Janus problem in text-to-3D generation, outperforming existing methods in both visual quality and consistency.
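The abstract's second component, a loss that aligns cosine similarities with azimuth relationships, can be illustrated with a small sketch. This is not the authors' formulation, which the abstract does not give. It assumes that each rendered view is summarized by a feature vector, that similarity to a reference view should not increase as the azimuth gap grows, and that violations are penalized with a hinge. The names `partial_order_loss`, `angular_gap`, and `margin` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angular_gap(a_deg, b_deg):
    """Smallest absolute azimuth difference in degrees, in [0, 180]."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def partial_order_loss(ref_feat, ref_az, feats, azimuths, margin=0.0):
    """Hypothetical hinge loss (an assumption, not the paper's definition).

    For every pair of views (i, j) where view i is closer in azimuth to the
    reference than view j, penalize the case where view j is nevertheless
    more similar to the reference than view i.
    """
    sims = [cosine(ref_feat, f) for f in feats]
    gaps = [angular_gap(ref_az, a) for a in azimuths]
    loss, pairs = 0.0, 0
    for i in range(len(feats)):
        for j in range(len(feats)):
            if gaps[i] < gaps[j]:
                loss += max(0.0, sims[j] - sims[i] + margin)
                pairs += 1
    return loss / pairs if pairs else 0.0


def unit(az_deg):
    """Toy 2D feature pointing along the given azimuth (illustration only)."""
    r = math.radians(az_deg)
    return (math.cos(r), math.sin(r))


views = [unit(30), unit(90), unit(180)]
consistent = partial_order_loss(unit(0), 0.0, views, [30.0, 90.0, 180.0])
swapped = partial_order_loss(unit(0), 0.0, views, [180.0, 90.0, 30.0])
```

In this toy setup the loss is zero when similarity falls off with azimuth distance and positive when the azimuth labels are swapped, which is the kind of ordering violation a Janus-style artifact (a front face repeated at the back) would produce.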

Subjects: Computer Vision and Pattern Recognition, Artificial Intelligence

Published: 2025-04-03 06:43:23 UTC

2504.02316
