BZSTC2RMEd@OpenReview

#1 Pruning-Robust Mamba with Asymmetric Multi-Scale Scanning Paths

Authors: Jindi Lv, Yuhao Zhou, Mingjia Shi, Zhiyuan Liang, Panpan Zhang, Xiaojiang Peng, Wangbo Zhao, Zheng Zhu, Jiancheng Lv, Qing Ye, Kai Wang

Mamba has proven efficient for long-sequence modeling in vision tasks. However, when token reduction techniques are applied to improve efficiency, Mamba-based models exhibit drastic performance degradation compared to Vision Transformers (ViTs). This decline likely stems from Mamba's chain-like scanning mechanism, which we hypothesize not only induces cascading losses in token connectivity but also limits the diversity of spatial receptive fields. In this paper, we propose Asymmetric Multi-scale Vision Mamba (AMVim), a novel architecture designed to enhance pruning robustness. AMVim employs a dual-path structure, integrating a window-aware scanning mechanism into one path while retaining sequential scanning in the other. This asymmetric design promotes diversity in token connectivity and enables multi-scale information flow, reinforcing spatial awareness. Empirical results demonstrate that AMVim achieves state-of-the-art pruning robustness: under token reduction, AMVim-T achieves a substantial 34% improvement in training-free accuracy at identical model size and FLOPs, while AMVim-S exhibits only a 1.5% accuracy drop, performing comparably to ViT. Notably, AMVim also delivers superior performance in pruning-free settings, further validating its architectural advantages.
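To make the dual-path idea concrete, below is a minimal sketch of an asymmetric block in which one path scans tokens in raster (sequential) order while the other reorders them window by window before mixing. This is not the authors' implementation: `SequenceMixer` is a hypothetical stand-in for a Mamba-style SSM block, the window size of 4 is assumed, and the grid is assumed to divide evenly by the window.

```python
# Minimal sketch of an asymmetric dual-path block over a (B, H, W, C) token grid.
# Assumptions: `SequenceMixer` is a placeholder for a Mamba-style sequence model,
# and H, W are divisible by `window`. Not the paper's actual code.
import torch
import torch.nn as nn


class SequenceMixer(nn.Module):
    """Hypothetical stand-in for a Mamba-style sequence block."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # (B, L, C)
        return self.proj(x)


def window_partition(x: torch.Tensor, window: int) -> torch.Tensor:
    """Reorder a (B, H, W, C) grid so tokens in each local window are contiguous."""
    B, H, W, C = x.shape
    x = x.view(B, H // window, window, W // window, window, C)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return x.view(B, -1, C)  # (B, H*W, C), window-major order


def window_reverse(x: torch.Tensor, window: int, H: int, W: int) -> torch.Tensor:
    """Undo window_partition, mapping tokens back to their grid positions."""
    B, _, C = x.shape
    x = x.view(B, H // window, W // window, window, window, C)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return x.view(B, H, W, C)


class AsymmetricDualPathBlock(nn.Module):
    """One path scans sequentially (raster order); the other scans window by
    window, giving the two paths different token-connectivity patterns."""

    def __init__(self, dim: int, window: int = 4):
        super().__init__()
        self.window = window
        self.seq_path = SequenceMixer(dim)
        self.win_path = SequenceMixer(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # (B, H, W, C)
        B, H, W, C = x.shape
        # Path 1: plain sequential (raster) scan.
        seq_out = self.seq_path(x.view(B, H * W, C)).view(B, H, W, C)
        # Path 2: window-aware scan, then restore the original grid layout.
        win_tokens = window_partition(x, self.window)
        win_out = window_reverse(self.win_path(win_tokens), self.window, H, W)
        return seq_out + win_out


# Usage example (shapes only; values are random):
# block = AsymmetricDualPathBlock(dim=192, window=4)
# out = block(torch.randn(2, 16, 16, 192))   # -> (2, 16, 16, 192)
```

The point of the sketch is the reordering, not the mixer: because the window path visits all tokens of a local neighborhood consecutively, pruning a token breaks a different set of scan adjacencies in each path, which is one plausible reading of why the asymmetry helps robustness.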

Subject: NeurIPS.2025 - Poster