Wang_CogCM_Cognition-Inspired_Contextual_Modeling_for_Audio-Visual_Speech_Enhancement@ICCV2025@CVF

#1 CogCM: Cognition-Inspired Contextual Modeling for Audio-Visual Speech Enhancement

Authors: Feixiang Wang, Shuang Yang, Shiguang Shan, Xilin Chen

Audio-Visual Speech Enhancement (AVSE) leverages both audio and visual information to improve speech quality. Even in noisy real-world conditions, humans are generally able to perceive and interpret corrupted speech segments as clear. Research in cognitive science has shown how the brain merges auditory and visual inputs to achieve this, which we summarize in four insights: (1) Humans utilize high-level semantic context to reconstruct corrupted speech signals, highlighting the importance of semantics. (2) Visual cues correlate strongly with semantic information, so they can facilitate semantic context modeling. (3) Visual appearance and vocal information jointly benefit identification, implying that visual cues strengthen low-level signal context modeling. (4) High-level semantic knowledge and low-level auditory processing operate concurrently, allowing the semantics to guide signal-level context modeling. Motivated by these insights, we propose CogCM, a cognition-inspired hierarchical contextual modeling framework. The CogCM framework includes three core modules: (1) a semantic context modeling module (SeCM) that captures high-level semantic context from both audio and visual modalities; (2) a signal context modeling module (SiCM) that models fine-grained temporal-spectral structures under multi-modal semantic context guidance; and (3) a semantic-to-signal guidance module (SSGM) that uses semantic context to guide signal context modeling across both temporal and frequency dimensions. Extensive experiments on 7 benchmarks demonstrate CogCM's superiority, including 63.6% SDR and 58.1% PESQ improvements at -15 dB SNR, outperforming state-of-the-art methods across all metrics.
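To give a sense of how severe the -15 dB SNR condition mentioned in the abstract is, the following is a minimal sketch of mixing a clean signal with noise at a target SNR. It is not taken from the paper: the helper names `mix_at_snr` and `measure_snr_db`, the 16 kHz sample rate, and the synthetic sine-plus-Gaussian-noise signals are all assumptions made for illustration. It relies only on the standard definition SNR = 10 log10(P_speech / P_noise).

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that the clean-to-noise power ratio equals `snr_db`.

    Hypothetical helper for illustration; returns (mixture, scaled_noise).
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # Noise power required by the SNR definition: p_clean / 10^(snr_db / 10)
    target_p_noise = p_clean / (10.0 ** (snr_db / 10.0))
    scaled_noise = noise * np.sqrt(target_p_noise / p_noise)
    return clean + scaled_noise, scaled_noise

def measure_snr_db(clean, noise):
    """SNR in dB computed from average signal and noise power."""
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

# Synthetic stand-ins (assumed): a 220 Hz tone as "speech", Gaussian noise.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2.0 * np.pi * 220.0 * t)
noise = np.random.default_rng(0).standard_normal(sample_rate)

noisy, scaled_noise = mix_at_snr(clean, noise, snr_db=-15.0)
measured = measure_snr_db(clean, scaled_noise)
power_ratio = np.mean(scaled_noise ** 2) / np.mean(clean ** 2)
print(f"measured SNR: {measured:.2f} dB, noise/speech power ratio: {power_ratio:.1f}")
```

At -15 dB the noise carries 10^1.5 (about 31.6) times the power of the speech, which is why purely acoustic cues become unreliable and additional context is valuable in this regime.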

Subject: ICCV.2025 - Poster