2601.10000

#1 EditEmoTalk: Controllable Speech-Driven 3D Facial Animation with Continuous Expression Editing

Authors: Diqiong Jiang, Kai Zhu, Dan Song, Jian Chang, Chenglizhao Chen, Zhenyu Wu

Speech-driven 3D facial animation aims to generate realistic and expressive facial motions directly from audio. While recent methods achieve high-quality lip synchronization, they often rely on discrete emotion categories, limiting continuous and fine-grained emotional control. We present EditEmoTalk, a controllable speech-driven 3D facial animation framework with continuous emotion editing. The key idea is a boundary-aware semantic embedding that learns the normal directions of inter-emotion decision boundaries, enabling a continuous expression manifold for smooth emotion manipulation. Moreover, we introduce an emotional consistency loss that enforces semantic alignment between the generated motion dynamics and the target emotion embedding through a mapping network, ensuring faithful emotional expression. Extensive experiments demonstrate that EditEmoTalk achieves superior controllability, expressiveness, and generalization while maintaining accurate lip synchronization. Code and pretrained models will be released.
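The idea of editing along the normal of an inter-emotion decision boundary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the embeddings are synthetic, the names `boundary_normal` and `edit_emotion` are hypothetical, and the normal is taken from a simple nearest-centroid boundary (the difference of class means) rather than from the learned embedding the abstract describes.

```python
import numpy as np


def boundary_normal(emb_a: np.ndarray, emb_b: np.ndarray) -> np.ndarray:
    """Unit normal of a linear boundary separating two emotion classes.

    Assumption: a nearest-centroid classifier stands in for the learned
    boundary, so the normal is the normalized difference of class means.
    """
    w = emb_b.mean(axis=0) - emb_a.mean(axis=0)
    return w / np.linalg.norm(w)


def edit_emotion(z: np.ndarray, normal: np.ndarray, alpha: float) -> np.ndarray:
    """Shift an embedding by a continuous amount alpha along the normal."""
    return z + alpha * normal


# Synthetic 8-D "emotion embeddings" (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
neutral = rng.normal(loc=0.0, scale=0.1, size=(50, 8))
happy = rng.normal(loc=0.0, scale=0.1, size=(50, 8))
happy[:, 0] += 1.0  # the two classes differ mainly along dimension 0

normal = boundary_normal(neutral, happy)
z = neutral[0]

# Projection onto the normal grows linearly with the edit strength alpha,
# which is what makes the edit continuous rather than categorical.
alphas = (0.0, 0.5, 1.0)
scores = [float(edit_emotion(z, normal, a) @ normal) for a in alphas]
```

Because the normal has unit length, each step of `alpha` moves the embedding by exactly that distance toward the target class, so intermediate values correspond to intermediate expression intensities in this toy setting.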

Subjects: Multimedia, Computer Vision and Pattern Recognition

Published: 2026-01-15 02:21:22 UTC