2512.10262

VLM-NCD: Novel Class Discovery with Vision-Based Large Language Models

Authors: Yuetong Su, Baoguo Wei, Xinyu Wang, Xu Li, Lixin Li

Novel Class Discovery (NCD) aims to utilise prior knowledge of known classes to classify and discover unknown classes from unlabelled data. Existing NCD methods for images rely primarily on visual features, which suffer from limitations such as insufficient feature discriminability and the long-tail distribution of data. We propose VLM-NCD, a multimodal framework that breaks this bottleneck by fusing visual-textual semantics with prototype-guided clustering. Our key innovation has two parts: modelling cluster centres and semantic prototypes of known classes by jointly optimising known-class image and text features, and a dual-phase discovery mechanism that dynamically separates known from novel samples via semantic affinity thresholds and adaptive clustering. Experiments on the CIFAR-100 dataset show that our method improves accuracy on unknown classes by up to 25.3% over current methods. Notably, our method shows unique resilience to long-tail distributions, a first in NCD literature.
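The dual-phase discovery idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine-similarity affinity, the fixed threshold of 0.5, and the plain k-means step (standing in for the "adaptive clustering" the abstract mentions) are all assumptions made for illustration. Phase one assigns a sample to a known class when its best affinity to a known-class text prototype clears the threshold; phase two clusters the remaining samples into novel groups.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def split_known_novel(image_feats, text_prototypes, threshold=0.5):
    # Phase 1 (assumed form): cosine affinity to known-class prototypes,
    # then a fixed threshold decides known vs. novel.
    affinity = l2_normalize(image_feats) @ l2_normalize(text_prototypes).T
    best = affinity.argmax(axis=1)
    is_known = affinity.max(axis=1) >= threshold
    labels = np.where(is_known, best, -1)  # -1 marks "not yet assigned"
    return labels, is_known

def kmeans(x, k, iters=20, seed=0):
    # Plain k-means, used here only as a stand-in for adaptive clustering.
    rng = np.random.default_rng(seed)
    centres = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:
                centres[j] = members.mean(axis=0)
    return assign

def discover(image_feats, text_prototypes, n_novel, threshold=0.5):
    # Phase 2 (assumed form): cluster the rejected samples into novel classes,
    # numbering them after the known classes.
    labels, is_known = split_known_novel(image_feats, text_prototypes, threshold)
    novel_idx = np.flatnonzero(~is_known)
    if n_novel > 0 and len(novel_idx) >= n_novel:
        novel_assign = kmeans(l2_normalize(image_feats[novel_idx]), n_novel)
        labels[novel_idx] = text_prototypes.shape[0] + novel_assign
    return labels
```

In this sketch the number of novel classes `n_novel` is supplied by the caller; the abstract does not say how the method determines it, so that choice is left open here.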

Subject: Computer Vision and Pattern Recognition

Publish: 2025-12-11 03:53:50 UTC