Tokenizing Motion: A Generative Approach for Scene Dynamics Compression

Authors: Shanzhi Yin, Zihan Zhang, Bolin Chen, Shiqi Wang, Yan Ye

This paper proposes a novel generative video compression framework that leverages motion pattern priors, derived from subtle dynamics in common scenes (e.g., swaying flowers or a boat drifting on water), rather than relying on video content priors (e.g., talking faces or human bodies). These compact motion priors enable a new approach to ultra-low bitrate communication while achieving high-quality reconstruction across diverse scene contents. At the encoder side, motion priors can be streamlined into compact representations via a dense-to-sparse transformation. At the decoder side, these priors facilitate the reconstruction of scene dynamics using an advanced flow-driven diffusion model. Experimental results illustrate that the proposed method can achieve superior rate-distortion performance and outperform the state-of-the-art conventional video codec Enhanced Compression Model (ECM) on scene dynamics sequences. The project page can be found at https://github.com/xyzysz/GNVDC.

Subjects: Computer Vision and Pattern Recognition, Image and Video Processing

Publish: 2024-10-13 07:54:02 UTC

2410.09768
