jvP1wbD0xh@OpenReview

Total: 1

#1 Better to Teach than to Give: Domain Generalized Semantic Segmentation via Agent Queries with Diffusion Model Guidance [PDF3] [Copy] [Kimi1] [REL]

Authors: Fan Li, Xuan Wang, Min Qi, Zhaoxiang Zhang, yuelei xu

Domain Generalized Semantic Segmentation (DGSS) trains a model on a labeled source domain to generalize to unseen target domains with consistent contextual distribution and varying visual appearance.Most existing methods rely on domain randomization or data generation but struggle to capture the underlying scene distribution, resulting in the loss of useful semantic information. Inspired by the diffusion model's capability to generate diverse variations within a given scene context, we consider harnessing its rich prior knowledge of scene distribution to tackle the challenging DGSS task.In this paper, we propose a novel agent \textbf{Query}-driven learning framework based on \textbf{Diff}usion model guidance for DGSS, named QueryDiff. Our recipe comprises three key ingredients: (1) generating agent queries from segmentation features to aggregate semantic information about instances within the scene; (2) learning the inherent semantic distribution of the scene through agent queries guided by diffusion features; (3) refining segmentation features using optimized agent queries for robust mask predictions.Extensive experiments across various settings demonstrate that our method significantly outperforms previous state-of-the-art methods. Notably, it enhances the model's ability to generalize effectively to extreme domains, such as cubist art styles. Code is available at https://github.com/FanLiHub/QueryDiff.

Subject: ICML.2025 - Spotlight