2503.07357

#1 Impact of Microphone Array Mismatches to Learning-based Replay Speech Detection

Authors: Michael Neri, Tuomas Virtanen

In this work, we investigate the generalization of a multi-channel learning-based replay speech detector, which employs adaptive beamforming and detection, across different microphone arrays. In general, deep neural network-based microphone array processing techniques generalize poorly to unseen array types, i.e., they show a significant performance gap between training and test conditions. We employ the ReMASC dataset to analyze performance degradation due to inter- and intra-device mismatches, assessing both single- and multi-channel configurations. Furthermore, we explore fine-tuning to mitigate the performance loss when transitioning to unseen microphone arrays. Our findings reveal that array mismatches significantly decrease detection accuracy, with intra-device generalization being more robust than inter-device generalization. However, fine-tuning with as little as ten minutes of target data can effectively recover performance, providing insights for practical deployment of replay detection systems in heterogeneous automatic speaker verification environments.

Subjects: Audio and Speech Processing, Signal Processing

Publish: 2025-03-10 14:14:35 UTC