cohen23@interspeech_2023@ISCA

Total: 1

#1 Towards Fully Quantized Neural Networks For Speech Enhancement

Authors: Elad Cohen; Hai Victor Habi; Arnon Netzer

Deep learning models have shown state-of-the-art results in speech enhancement. However, deploying such models on an eight-bit integer-only device is challenging. In this work, we analyze the gaps that arise when applying a vanilla quantization-aware training method to speech enhancement, and we make two significant observations. First, quantization mainly affects signals with a high input Signal-to-Noise Ratio (SNR). Second, quantizing the model's input and output causes major performance degradation. Based on our analysis, we propose Fully Quantized Speech Enhancement (FQSE), a new quantization-aware training method that closes these gaps and enables eight-bit integer-only quantization. FQSE introduces data augmentation to mitigate the quantization effect on high-SNR signals. Additionally, we add an input splitter and a residual quantization block to the model to overcome the input-output quantization error. We show that FQSE closes the performance gaps induced by eight-bit quantization.
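
To make the setting concrete, below is a minimal sketch of quantization-aware training with fake (simulated) int8 quantization applied to a model's input and output, the configuration the abstract identifies as the main source of degradation. This is not the authors' FQSE implementation: `fake_quantize`, `QuantizedDenoiser`, and the feature dimension are illustrative assumptions, and the per-tensor symmetric scale is one common choice among several.

```python
import torch
import torch.nn as nn

def fake_quantize(x, num_bits=8):
    """Simulate uniform symmetric int8 quantization in the forward pass
    while letting gradients pass through (straight-through estimator)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Per-tensor scale from the dynamic range (assumption; not the paper's scheme).
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), qmin, qmax) * scale
    # Forward uses the quantized value; backward sees the identity.
    return x + (q - x).detach()

class QuantizedDenoiser(nn.Module):
    """Hypothetical masking-based enhancement model whose input and output
    are fake-quantized, mimicking integer-only deployment end to end."""
    def __init__(self, dim=257):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, noisy_spec):
        x = fake_quantize(noisy_spec)        # quantize the model input
        mask = torch.sigmoid(self.net(x))    # predict an enhancement mask
        enhanced = mask * x
        return fake_quantize(enhanced)       # quantize the model output
```

Training such a model and evaluating it bucketed by input SNR is one way to reproduce the abstract's first observation: at high SNR the clean signal dominates, so input-output quantization noise becomes the limiting error floor.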