2510.18291

GeoDiff: Geometry-Guided Diffusion for Metric Depth Estimation

Authors: Tuan Pham, Thanh-Tung Le, Xiaohui Xie, Stephan Mandt

We introduce a novel framework for metric depth estimation that enhances pretrained diffusion-based monocular depth estimation (DB-MDE) models with stereo vision guidance. While existing DB-MDE methods excel at predicting relative depth, estimating absolute metric depth remains challenging due to scale ambiguities in single-image scenarios. To address this, we reframe depth estimation as an inverse problem, leveraging pretrained latent diffusion models (LDMs) conditioned on RGB images, combined with stereo-based geometric constraints, to learn scale and shift for accurate depth recovery. Our training-free solution seamlessly integrates into existing DB-MDE frameworks and generalizes across indoor, outdoor, and complex environments. Extensive experiments demonstrate that our approach matches or surpasses state-of-the-art methods, particularly in challenging scenarios involving translucent and specular surfaces, all without requiring retraining.
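To make the scale-and-shift ambiguity concrete, the sketch below fits a single scale and shift that map relative depth values onto metric depths obtained from rectified stereo disparity. It is only an illustration of the ambiguity the abstract refers to, not the paper's method: the abstract describes guiding a pretrained diffusion model through an inverse-problem formulation, whereas this example uses a plain closed-form least-squares fit. The function names, the pinhole stereo relation (depth = focal length × baseline / disparity), and the choice to align in depth space rather than disparity space are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Rectified-stereo relation (assumed here): depth = focal * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def fit_scale_shift(relative: Sequence[float], metric: Sequence[float]) -> Tuple[float, float]:
    """Closed-form least squares for metric ~= scale * relative + shift.

    Hypothetical helper: `relative` holds relative depth predictions and
    `metric` holds stereo-derived depths at the same pixels.
    """
    n = len(relative)
    if n != len(metric) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_r = sum(relative) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0:
        raise ValueError("relative depths are constant; scale is undetermined")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(relative, metric))
    scale = cov / var_r
    shift = mean_m - scale * mean_r
    return scale, shift


def to_metric(relative: Sequence[float], scale: float, shift: float) -> list:
    """Apply the fitted affine map to every relative depth value."""
    return [scale * r + shift for r in relative]


if __name__ == "__main__":
    # Toy values: four pixels with relative depth and matching stereo disparity.
    relative_depth = [0.1, 0.4, 0.7, 1.0]
    disparities_px = [33.0, 17.0, 11.0, 8.5]
    stereo_depth = [disparity_to_depth(d, focal_px=500.0, baseline_m=0.1) for d in disparities_px]
    scale, shift = fit_scale_shift(relative_depth, stereo_depth)
    print(scale, shift, to_metric(relative_depth, scale, shift))
```

In practice a robust variant (for example, discarding pixels with unreliable disparity before fitting) would likely be needed, since stereo matching tends to fail on exactly the translucent and specular surfaces the abstract mentions.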

Subject: Computer Vision and Pattern Recognition

Published: 2025-10-21 04:47:36 UTC