Wang_Open_Ad-hoc_Categorization_with_Contextualized_Feature_Learning@CVPR2025@CVF

Total: 1

#1 Open Ad-hoc Categorization with Contextualized Feature Learning

Authors: Zilin Wang, Sangwoo Mo, Stella X. Yu, Sima Behpour, Liu Ren

Unlike common categories for plants and animals, ad-hoc categories such as "things to sell at a garage sale" are created to help people achieve a certain task. Likewise, AI agents need to adaptively categorize visual scenes in response to changing tasks. We thus study open ad-hoc categorization, where we learn to infer novel concepts and name images according to a varying categorization purpose, a few labeled exemplars, and many unlabeled images. We develop a simple method that combines top-down text guidance (CLIP) with bottom-up image clustering (GCD) to learn contextualized visual features and align visual clusters with CLIP semantics, enabling predictions for both known and novel classes. Benchmarked on the multi-label datasets Stanford and Clevr-4, our method, called OAK, significantly outperforms baselines in providing accurate predictions across contexts and in identifying novel concepts; for example, it achieves 87.4% novel accuracy on Stanford Mood, surpassing CLIP and GCD by over 50%. OAK also offers interpretable saliency maps, focusing on hands, faces, and backgrounds for the Action, Mood, and Location contexts, respectively.
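The core idea in the abstract, combining bottom-up clustering of image features with top-down naming of clusters via text embeddings, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the random arrays stand in for CLIP image features and CLIP text embeddings of context-specific class names, and the simple k-means loop stands in for the GCD-style clustering.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in OAK these would be CLIP image features for
# unlabeled images and CLIP text embeddings of candidate class names
# (e.g., Mood labels for the Stanford Mood context).
image_feats = rng.normal(size=(60, 8))
text_embeds = rng.normal(size=(3, 8))  # one embedding per class name

def l2norm(x):
    """Normalize rows to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

image_feats, text_embeds = l2norm(image_feats), l2norm(text_embeds)

# Bottom-up step: a few k-means iterations cluster the unlabeled images.
k = 3
centers = image_feats[rng.choice(len(image_feats), k, replace=False)]
for _ in range(10):
    assign = np.argmax(image_feats @ centers.T, axis=1)  # cosine assignment
    new_centers = []
    for c in range(k):
        members = image_feats[assign == c]
        # Keep the old center if a cluster happens to be empty.
        new_centers.append(members.mean(0) if len(members) else centers[c])
    centers = l2norm(np.stack(new_centers))

# Top-down step: name each visual cluster with its most similar text
# embedding, aligning visual clusters with semantics so predictions can
# cover both known and novel classes.
cluster_to_name = np.argmax(centers @ text_embeds.T, axis=1)
pred_names = cluster_to_name[assign]  # one class-name index per image
```

In the actual method the features themselves are also contextualized (learned per categorization purpose), whereas this sketch only shows the cluster-to-semantics alignment on fixed features.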

Subject: CVPR.2025 - Poster