patil17@interspeech_2017@ISCA

#1 Novel Variable Length Teager Energy Separation Based Instantaneous Frequency Features for Replay Detection

Authors: Hemant A. Patil ; Madhu R. Kamble ; Tanvina B. Patel ; Meet H. Soni

Replay attacks present a great risk to Automatic Speaker Verification (ASV) systems. In this paper, we propose a novel replay detector based on Variable length Teager Energy Operator-Energy Separation Algorithm-Instantaneous Frequency Cosine Coefficients (VESA-IFCC) for the ASVspoof 2017 challenge. The key idea is to exploit the contribution of the instantaneous frequency (IF) in each subband energy, via the Energy Separation Algorithm (ESA), to capture possible changes in the spectral envelope of replayed speech (due to the transmission and channel characteristics of the replay device). The IF is computed from narrowband components of the speech signal, and the Discrete Cosine Transform (DCT) is applied to the IF to obtain the proposed feature set. We compare the performance of the proposed VESA-IFCC feature set with features developed for detecting synthetic and voice-converted speech. These include CQCC, CFCCIF, and prosody-based features. On the development set, the proposed VESA-IFCC features, when fused at score level with a variant of CFCCIF and prosody-based features, gave the lowest EER of 0.12%. On the evaluation set, this combination gave an EER of 18.33%. However, post-evaluation results of the challenge indicate that the VESA-IFCC features alone gave the comparatively lowest EER of 14.06% (i.e., relatively 16.11% less than the baseline CQCC) and hence are a very useful countermeasure for detecting replay attacks.
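
The pipeline described in the abstract (Teager energy, energy separation to obtain the IF, then a DCT) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly used variable-length Teager energy form x[n]^2 - x[n-i]x[n+i] and a DESA-2-style separation step generalized to lag i, which is exact only for a pure sinusoid with frequency below fs/(4i). The function names (`vteo`, `esa_inst_freq`, `dct2`), the synthetic tones standing in for narrowband subband signals, and the averaging of the IF per subband are all assumptions made for illustration.

```python
import math

def vteo(x, lag=1):
    """Variable-length Teager energy: x[n]^2 - x[n-lag] * x[n+lag]."""
    return [x[n] * x[n] - x[n - lag] * x[n + lag]
            for n in range(lag, len(x) - lag)]

def esa_inst_freq(x, lag=1, eps=1e-12):
    """DESA-2-style instantaneous frequency (radians/sample).

    Assumption: for x[n] = A*cos(W*n + phi), vteo(x) = A^2 * sin^2(lag*W), and
    for y[n] = x[n+lag] - x[n-lag], vteo(y) = 4 * A^2 * sin^4(lag*W), so
    W = arccos(1 - vteo(y) / (2 * vteo(x))) / (2 * lag).
    """
    y = [x[n + lag] - x[n - lag] for n in range(lag, len(x) - lag)]
    psi_x = vteo(x, lag)  # psi_x[j] belongs to sample j + lag of x
    psi_y = vteo(y, lag)  # psi_y[k] belongs to sample k + 2*lag of x
    freqs = []
    for k, py in enumerate(psi_y):
        px = psi_x[k + lag]  # align both energies to the same sample
        if px <= eps:
            continue  # skip samples where the energy estimate is unusable
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        freqs.append(math.acos(c) / (2.0 * lag))
    return freqs

def dct2(v, n_coeffs):
    """Unnormalized DCT-II of the sequence v, keeping the first n_coeffs terms."""
    n_pts = len(v)
    return [sum(v[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts))
                for n in range(n_pts))
            for k in range(n_coeffs)]

def mean_if_hz(x, fs, lag=1):
    """Average IF of one narrowband signal, converted to Hz."""
    f = esa_inst_freq(x, lag)
    return (sum(f) / len(f)) * fs / (2.0 * math.pi)

# Synthetic stand-ins for narrowband subband signals (hypothetical values).
fs = 16000
tones_hz = [500.0, 1000.0, 2000.0]
subbands = [[math.cos(2 * math.pi * f0 * n / fs) for n in range(400)]
            for f0 in tones_hz]

subband_if = [mean_if_hz(x, fs) for x in subbands]  # one IF value per subband
coeffs = dct2(subband_if, n_coeffs=3)               # cosine coefficients
print(subband_if)
print(coeffs)
```

In a real front end the subband signals would come from a filterbank applied to framed speech, and the IF would be computed per frame. The sketch only shows how the three stages connect.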