Data-Forcing Distillation: Restoring Diversity and Fidelity in Few-Step Video Generation

Authors: Siyi Chen, Shaowei Liu, Yixuan Jia, Zian Wang, Huan Ling, Qing Qu, Jun Gao

Recent work has shown promise in distilling multi-step video diffusion models into efficient few-step students. Among these approaches, Distribution Matching Distillation (DMD) and its successor DMD2 achieve strong generation quality and fast convergence. However, due to the nature of the reverse Kullback-Leibler (KL) objective, these methods exhibit two persistent failure modes: a substantial drop in sample diversity, and visibly over-saturated outputs that deviate from real-video appearance. In this work, we propose Data-Forcing Distillation (DFD), a simple post-training framework that restores diversity and fidelity in DMD with only a single-line code change. At its core, DFD uses the teacher score discrepancy to guide the student toward the real-data distribution, pulling it toward missing modes (mitigating mode collapse) and away from problematic modes absent in real data (avoiding over-saturation). We provide an in-depth theoretical analysis of our framework and validate our approach on text-to-video, image-to-video, and autoregressive video generation. With only 100 to 300 steps of finetuning, DFD effectively restores diversity and fidelity on both the Wan2.1-1.3B and Cosmos-Predict2.5-2B models, resolving the over-saturation artifacts, yielding significantly better video dynamics and appearance, and even outperforming the teacher model.

Subject: Computer Vision and Pattern Recognition

Publish: 2026-06-16 20:38:30 UTC

2606.18478
