sun25d@interspeech_2025@ISCA

#1 Scaling beyond Denoising: Submitted System and Findings in URGENT Challenge 2025

Authors: Zhihang Sun, Andong Li, Tong Lei, Rilin Chen, Meng Yu, Chengshi Zheng, Yi Zhou, Dong Yu

Although deep neural networks have substantially advanced speech enhancement (SE), current studies focus mainly on noise suppression and narrow benchmarks, with limited exploration of realistic scenarios. This paper presents a system that jointly addresses multiple degradations within an end-to-end framework. Specifically, we propose a variant of the dual-path architecture in the time-frequency domain. It involves a fast band-processing strategy that enables parallel band-split and band-merge operations, and a channel-mixing module based on Fourier Analysis Networks to facilitate target estimation. In addition, we propose a simple yet effective progressive block extension strategy that supports scaling up the model during training with limited GPU resources. We participated in the URGENT Challenge 2025, and our submitted system ranked first on Track 1. The related findings and the analysis of scaling effects provide insights for building universal SE systems.
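The abstract does not specify how the channel-mixing module is built, so the following is only a rough sketch of the general idea behind a Fourier Analysis Network (FAN) style layer, as we understand it from the FAN literature: part of the output comes from cosine and sine projections of the input, and the rest from an ordinary nonlinear projection. The function names (`make_fan_layer`, `fan_forward`), the `periodic_ratio` value, the GELU activation, and the random initialization are all assumptions for illustration and are not taken from the paper.

```python
import math
import random


def gelu(v):
    # tanh approximation of the GELU activation
    return 0.5 * v * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)))


def matvec(W, x):
    # W is a list of rows; each row has the same length as x
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_fan_layer(d_in, d_out, periodic_ratio=0.25, seed=0):
    # Hypothetical helper: builds randomly initialized weights for one layer.
    # periodic_ratio (assumed value) sets the width of EACH of the cos and sin branches.
    rng = random.Random(seed)
    d_p = int(d_out * periodic_ratio)
    d_g = d_out - 2 * d_p  # width of the ordinary nonlinear branch
    scale = 1.0 / math.sqrt(d_in)
    W_p = [[rng.gauss(0.0, scale) for _ in range(d_in)] for _ in range(d_p)]
    W_g = [[rng.gauss(0.0, scale) for _ in range(d_in)] for _ in range(d_g)]
    b_g = [0.0] * d_g
    return W_p, W_g, b_g


def fan_forward(layer, x):
    # Output layout: [cos(W_p x), sin(W_p x), gelu(W_g x + b_g)]
    W_p, W_g, b_g = layer
    p = matvec(W_p, x)
    g = [gelu(v + b) for v, b in zip(matvec(W_g, x), b_g)]
    return [math.cos(v) for v in p] + [math.sin(v) for v in p] + g


# Mix the 16 channels of a single time-frequency bin (toy input).
layer = make_fan_layer(d_in=16, d_out=16)
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(16)]
y = fan_forward(layer, x)
```

In a dual-path model, a layer like this would presumably be applied along the channel axis at every time-frequency position; how the submitted system arranges it around the band-split and band-merge operations is not stated in the abstract.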

Subject: INTERSPEECH.2025 - Modelling and Learning