2507.08227


RawTFNet: A Lightweight CNN Architecture for Speech Anti-spoofing

Authors: Yang Xiao, Ting Dang, Rohan Kumar Das

Automatic speaker verification (ASV) systems are vulnerable to spoofing attacks. Recent transformer-based models have improved anti-spoofing performance by learning strong feature representations, but they usually require substantial computing power. To address this, we introduce RawTFNet, a lightweight CNN model designed for audio signals. RawTFNet separates feature processing along the time and frequency dimensions, which helps capture the fine-grained details of synthetic speech. We evaluated RawTFNet on the ASVspoof 2021 LA and DF evaluation datasets. The results show that RawTFNet achieves performance comparable to that of state-of-the-art models while using fewer computing resources. The code and models will be made publicly available.
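The idea of processing time and frequency separately can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the feature-map layout `fmap[frequency][time]`, the function names, and the hand-picked kernels are all assumptions made for illustration, and a real model would learn its kernels and use a tensor library. The sketch filters a 2D feature map with one 1D kernel along time and another along frequency, instead of a single 2D kernel.

```python
def conv1d_same(seq, kernel):
    """Zero-padded 1D correlation that keeps the input length (odd-length kernel)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(seq)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(seq):  # positions outside the sequence count as zero
                acc += w * seq[idx]
        out.append(acc)
    return out


def filter_time(fmap, kernel):
    """Filter every frequency row along the time axis (assumed layout: fmap[f][t])."""
    return [conv1d_same(row, kernel) for row in fmap]


def filter_freq(fmap, kernel):
    """Filter every time column along the frequency axis."""
    cols = [list(c) for c in zip(*fmap)]             # transpose to [t][f]
    filtered = [conv1d_same(c, kernel) for c in cols]
    return [list(r) for r in zip(*filtered)]         # transpose back to [f][t]


def time_freq_block(fmap, k_time, k_freq):
    """Hypothetical block: time-axis filtering followed by frequency-axis filtering."""
    return filter_freq(filter_time(fmap, k_time), k_freq)


# Toy feature map with 2 frequency bins and 3 time frames.
fmap = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
identity = [0.0, 1.0, 0.0]
smoothed = time_freq_block(fmap, [1.0, 1.0, 1.0], identity)
```

In this sketch, two 1D kernels of length k hold 2k weights, whereas one k-by-k 2D kernel holds k squared. The abstract does not state whether RawTFNet's savings come from this kind of factorization, so that link is only a plausible reading.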

Subjects: Audio and Speech Processing, Sound

Published: 2025-07-11 00:24:47 UTC