Diorama: Unleashing Zero-shot Single-view 3D Indoor Scene Modeling

Authors: Qirui Wu, Denys Iliash, Daniel Ritchie, Manolis Savva, Angel X. Chang

Reconstructing structured 3D scenes from RGB images using CAD objects yields efficient, compact scene representations that preserve compositionality and interactability. Existing methods are training-heavy and rely on either real-world annotations, which are expensive yet inaccurate, or synthetic data, which is controllable yet monotonous. Such methods do not generalize well to unseen objects or domains. We present Diorama, the first zero-shot open-world system that holistically models 3D scenes from single-view RGB observations without requiring end-to-end training or human annotations. We demonstrate the feasibility of our approach by decomposing the problem into subtasks and introducing better solutions to each: architecture reconstruction, 3D shape retrieval, object pose estimation, and scene layout optimization. We evaluate our system on both synthetic and real-world data and show that it significantly outperforms baselines from prior work. We also demonstrate generalization to real-world internet images and to the text-to-scene task.

Subject: ICCV.2025 - Highlight