Total: 1

#1 Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera

Authors: Yuliang Guo, Sparsh Garg, S. Mahdi H. Miangoleh, Xinyu Huang, Liu Ren

Accurate metric depth estimation from monocular cameras is essential for applications such as autonomous driving, AR/VR, and robotics. While recent depth estimation methods demonstrate strong zero-shot generalization, achieving accurate metric depth across diverse camera types—particularly those with large fields of view (FoV) like fisheye and $360^\circ$ cameras—remains challenging. This paper introduces Depth Any Camera (DAC), a novel zero-shot metric depth estimation framework that extends a perspective-trained model to handle varying FoVs effectively. Notably, DAC is trained exclusively on perspective images, yet it generalizes seamlessly to fisheye and $360^\circ$ cameras without requiring specialized training. DAC leverages Equi-Rectangular Projection (ERP) as a unified image representation, enabling consistent processing of images with diverse FoVs. Key components include an efficient Image-to-ERP patch conversion for online ERP-space augmentation, a FoV alignment operation to support effective training across a broad range of FoVs, and multi-resolution data augmentation to address resolution discrepancies between training and testing. DAC achieves state-of-the-art zero-shot metric depth estimation, improving $\delta_1$ accuracy by up to 50\% on multiple indoor fisheye and $360^\circ$ datasets, demonstrating robust generalization across camera types while relying only on perspective training data.
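The Image-to-ERP conversion is the piece that lets a single ERP canvas serve any camera: each ERP pixel corresponds to a viewing direction (longitude, latitude), and the source image is resampled along those directions. The sketch below is a minimal NumPy illustration of that idea for a pinhole camera; the function name, parameters, and nearest-neighbour sampling are assumptions for illustration, not the paper's actual implementation (which performs the conversion efficiently online for ERP-space augmentation and also accommodates fisheye and $360^\circ$ inputs at test time).

```python
import numpy as np

def perspective_to_erp_patch(img, fx, fy, cx, cy, patch_h, patch_w,
                             lat_range, lon_range):
    """Resample a pinhole image onto an equirectangular (ERP) patch.

    Illustrative only: every name and parameter here is hypothetical, not
    taken from the DAC codebase. Angles are in radians; latitude is measured
    downward to match the y-down camera convention.
    """
    h, w = img.shape[:2]

    # Longitude/latitude grid covering the requested ERP patch.
    lon = np.linspace(lon_range[0], lon_range[1], patch_w)
    lat = np.linspace(lat_range[0], lat_range[1], patch_h)
    lon, lat = np.meshgrid(lon, lat)

    # Unit ray direction for each ERP pixel (x right, y down, z forward).
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    # Pinhole projection of each ray into the source perspective image.
    u = fx * x / np.clip(z, 1e-6, None) + cx
    v = fy * y / np.clip(z, 1e-6, None) + cy
    valid = (z > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Nearest-neighbour lookup; a real pipeline would interpolate bilinearly.
    ui = np.clip(np.round(u).astype(int), 0, w - 1)
    vi = np.clip(np.round(v).astype(int), 0, h - 1)
    patch = img[vi, ui].copy()
    patch[~valid] = 0
    return patch

# Example: map a ~60-degree-FoV perspective image onto a +/-30 degree ERP patch.
rgb = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
erp = perspective_to_erp_patch(rgb, fx=554.3, fy=554.3, cx=320.0, cy=240.0,
                               patch_h=256, patch_w=256,
                               lat_range=(-np.pi / 6, np.pi / 6),
                               lon_range=(-np.pi / 6, np.pi / 6))
```

Because the ERP patch is indexed purely by viewing direction, the same canvas can absorb images from narrow perspective cameras during training and from wide-FoV fisheye or $360^\circ$ cameras at inference, which is what enables the zero-shot transfer described above.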

Subject: CVPR.2025 - Poster