Sequence-Adaptive Video Prediction in Continuous Streams using Diffusion Noise Optimization

Authors: Sina Mokhtarzadeh Azar, Emad Bahrami, Enrico Pallotta, Gianpiero Francesca, Radu Timofte, Juergen Gall

In this work, we investigate diffusion-based video prediction models, which forecast future video frames, for continuous video streams. In this setting, the models continuously observe new training samples, and we aim to leverage these samples to improve their predictions. We thus propose an approach that continuously adapts a pre-trained diffusion model to a video stream. Since fine-tuning the parameters of a large diffusion model is too expensive, we refine the diffusion noise during inference while keeping the model parameters frozen, allowing the model to adaptively determine suitable sampling noise. We term the approach Sequence Adaptive Video Prediction with Diffusion Noise Optimization (SAVi-DNO). To validate our approach, we introduce a new evaluation setting on the Ego4D dataset, focusing on simultaneous adaptation and evaluation on long continuous videos. Empirical results show improved FVD, SSIM, and PSNR on long videos from Ego4D and OpenDV-YouTube, as well as on videos from UCF-101 and SkyTimelapse, demonstrating SAVi-DNO's effectiveness.
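The core idea in the abstract, adapting the sampling noise on a stream while the model weights stay frozen, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the sampler, the loss, or the update rule, so everything below is an assumption made for illustration. The hypothetical `frozen_predictor` is a fixed linear map standing in for a frozen diffusion sampler, `delta` is a hypothetical learned noise offset, and the loss is a plain mean squared error against the next observed frame.

```python
import numpy as np


def frozen_predictor(context, noise, W):
    """Toy stand-in for a frozen sampler (assumption): W is never updated."""
    return context + W @ noise


def adapt_noise_on_stream(frames, W, lr=0.5, noise_scale=0.01, seed=0):
    """Predict each next frame, score it, then update only the noise offset."""
    rng = np.random.default_rng(seed)
    dim = W.shape[1]
    delta = np.zeros(dim)  # learned noise offset, carried along the stream
    errors = []
    for t in range(len(frames) - 1):
        context, target = frames[t], frames[t + 1]
        # Fresh random noise plus the offset learned from earlier frames.
        noise = noise_scale * rng.standard_normal(dim) + delta
        pred = frozen_predictor(context, noise, W)
        residual = pred - target
        # Evaluate first, so each score reflects a prediction made
        # before the target frame was used for adaptation.
        errors.append(float(np.mean(residual ** 2)))
        # Gradient of the mean squared error with respect to the noise only.
        grad = 2.0 * (W.T @ residual) / residual.size
        delta = delta - lr * grad
    return delta, errors


# Hypothetical stream: every "frame" is a 4-vector that drifts by a constant step.
drift = np.array([0.5, -0.3, 0.2, 0.1])
frames = np.cumsum(np.tile(drift, (101, 1)), axis=0)
W = np.eye(4)

delta, errors = adapt_noise_on_stream(frames, W)
print("mean error, first 10 steps:", np.mean(errors[:10]))
print("mean error, last 10 steps:", np.mean(errors[-10:]))
```

In this toy setting the prediction error shrinks as the offset absorbs the stream's constant drift, even though `W` is never modified. A real diffusion sampler is nonlinear and multi-step, so the gradient with respect to the noise would have to be obtained differently (for example by backpropagating through the sampler), which this sketch does not attempt.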

Subject: Computer Vision and Pattern Recognition

Publish: 2025-11-23 02:58:10 UTC

2511.18255
