Can LLMs Help Localize Fake Words in Partially Fake Speech?

Authors: Lin Zhang, Thomas Thebaud, Zexin Cai, Sanjeev Khudanpur, Daniel Povey, Leibny Paola García-Perera, Matthew Wiesner, Nicholas Andrews

Large language models (LLMs), trained on large-scale text, have recently attracted significant attention for their strong performance across many tasks. Motivated by this, we investigate whether a text-trained LLM can help localize fake words in partially fake speech, where only specific words within an utterance are edited. We build a speech LLM to perform fake word localization via next-token prediction. Experiments and analyses on AV-Deepfake1M and PartialEdit indicate that the model frequently leverages editing-style patterns learned from the training data, particularly the word-level polarity substitutions found in these two databases, as cues for localizing fake words. Although such patterns provide useful information in an in-domain scenario, how to avoid over-reliance on them and improve generalization to unseen editing styles remains an open question.

Subjects: Audio and Speech Processing, Sound

Publish: 2026-03-11 18:21:14 UTC

2603.11205
