LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation

Authors: Vladan Stojnić, Yannis Kalantidis, Jiří Matas, Giorgos Tolias

We propose a training-free method for open-vocabulary semantic segmentation using Vision-and-Language Models (VLMs). Our approach enhances the initial per-patch predictions of VLMs through label propagation, which jointly optimizes predictions by incorporating patch-to-patch relationships. Since VLMs are primarily optimized for cross-modal alignment and not for intra-modal similarity, we use a Vision Model (VM) that is observed to better capture these relationships. We address resolution limitations inherent to patch-based encoders by applying label propagation at the pixel level as a refinement step, significantly improving segmentation accuracy near class boundaries. Our method, called LPOSS+, performs inference over the entire image, avoiding window-based processing and thereby capturing contextual interactions across the full image. LPOSS+ achieves state-of-the-art performance across a diverse set of datasets.
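To make the patch-level idea concrete, here is a minimal sketch of generic closed-form label propagation over a k-nearest-neighbor graph. It is not the authors' implementation: the function name `propagate_labels`, the parameters `k` and `alpha`, the cosine-similarity graph, and the toy data are all assumptions chosen for illustration. The graph is built from (hypothetical) VM patch features, and the initial class scores stand in for per-patch VLM predictions.

```python
import numpy as np

def propagate_labels(vm_feats, vlm_scores, k=3, alpha=0.9):
    """Generic label propagation (illustrative; not the paper's exact formulation).

    vm_feats:   (n, d) patch features, assumed to come from a vision model.
    vlm_scores: (n, c) initial per-patch class scores, assumed to come from a VLM.
    """
    n = vm_feats.shape[0]
    # Cosine similarity between patches.
    f = vm_feats / np.linalg.norm(vm_feats, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches from the neighbor search

    # Keep the k most similar neighbors of each patch, clipping negative weights.
    idx = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    w = np.zeros((n, n))
    w[rows, cols] = np.clip(sim[rows, cols], 0.0, None)
    w = np.maximum(w, w.T)  # symmetrize the graph

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(w.sum(axis=1), 1e-12))
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Closed-form solution of the propagation: Z = (I - alpha * S)^{-1} Y.
    return np.linalg.solve(np.eye(n) - alpha * s, vlm_scores)

# Toy example: two groups of four patches; patch 3 starts with a wrong prediction.
feats = np.array([[1.0, 0.00], [1.0, 0.05], [1.0, 0.10], [1.0, 0.15],
                  [0.00, 1.0], [0.05, 1.0], [0.10, 1.0], [0.15, 1.0]])
scores = np.array([[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.4, 0.6],
                   [0.1, 0.9], [0.1, 0.9], [0.1, 0.9], [0.1, 0.9]])
refined = propagate_labels(feats, scores)
initial_labels = scores.argmax(axis=1)
refined_labels = refined.argmax(axis=1)
```

In this toy setup, the inconsistent prediction for patch 3 is pulled toward the class of its visually similar neighbors, which is the kind of joint correction the abstract attributes to patch-to-patch relationships. The pixel-level refinement step mentioned in the abstract is not sketched here.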

Subject: CVPR.2025 - Poster

Stojnic_LPOSS_Label_Propagation_Over_Patches_and_Pixels_for_Open-vocabulary_Semantic@CVPR2025@CVF
