Zhu_MambaML_Exploring_State_Space_Models_for_Multi-Label_Image_Classification@ICCV2025@CVF


#1 MambaML: Exploring State Space Models for Multi-Label Image Classification

Authors: Xuelin Zhu, Jian Liu, Jiuxin Cao, Bing Wang

Mamba, a selective state-space model, has recently seen widespread application across various visual tasks due to its exceptional ability to capture long-range dependencies. While promising results have been demonstrated in image classification, its potential in multi-label image classification remains underexplored. To bridge this gap, we propose a novel Mamba-based decoder, which utilizes the intrinsic attention of Mamba to aggregate visual information from image features into label embeddings, yielding label-specific visual representations. Building upon this, a MambaML framework is developed for multi-label image classification, which models the self-correlations of image features and label embeddings with bi-directional Mamba, as well as their cross-correlations with the Mamba-based decoder, allowing visual spatial relationships, label semantic dependencies, and cross-modal associations to be explored in a unified system. In this way, robust label-specific visual representations are acquired, facilitating the training of binary classifiers towards accurate label recognition. Experiments on public benchmarks suggest that our MambaML achieves performance comparable to state-of-the-art methods in multi-label image classification, while requiring fewer parameters and less computational overhead.
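The data flow described in the abstract (image features pooled into one representation per label, then scored by per-label binary classifiers) can be illustrated with a small sketch. This is not the MambaML decoder: the abstract does not specify how Mamba's intrinsic attention is computed, so plain softmax attention is swapped in as a stand-in. The function names, dimensions, and random inputs are all hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_label_features(image_feats, label_embs):
    """Pool N image tokens into one D-dim feature per label.

    ASSUMPTION: softmax attention stands in for the paper's
    Mamba-based decoder, whose details are not given in the abstract.
    """
    dim = len(label_embs[0])
    label_feats = []
    for label in label_embs:
        # Each label embedding scores every image token.
        weights = softmax(
            [dot(label, tok) / math.sqrt(dim) for tok in image_feats]
        )
        # Weighted sum of tokens gives the label-specific feature.
        pooled = [
            sum(w * tok[d] for w, tok in zip(weights, image_feats))
            for d in range(dim)
        ]
        label_feats.append(pooled)
    return label_feats


def classify(label_feats, clf_weights, clf_biases):
    """Apply one independent sigmoid (binary) classifier per label."""
    return [
        1.0 / (1.0 + math.exp(-(dot(w, f) + b)))
        for f, w, b in zip(label_feats, clf_weights, clf_biases)
    ]


# Hypothetical toy sizes: N image tokens, K labels, D feature dims.
random.seed(0)
N, K, D = 6, 3, 4
image_feats = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
label_embs = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
clf_weights = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
clf_biases = [0.0] * K

label_feats = aggregate_label_features(image_feats, label_embs)
probs = classify(label_feats, clf_weights, clf_biases)
print(probs)  # K independent per-label probabilities
```

Because each label receives its own sigmoid score, several labels can be predicted for one image, which is what separates the multi-label setting from single-label softmax classification.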

Subject: ICCV.2025 - Poster