2606.08554

A Theoretical Analysis of Memory and Overfitting Phenomena in Stochastic Interpolation Models

Authors: Yunchen Li, Shaohui Lin, Zhou Yu

This paper provides a theoretical account of memorization in stochastic interpolation models. By leveraging closed-form expressions for the optimal velocity field and the associated score function, we show that, in the continuous-time oracle setting, both deterministic and stochastic generation processes recover training samples. Under Euler discretization, generated samples remain centered around training samples, with deviations controlled by the step size. We further analyze generation in the presence of estimation errors and show that the accumulated estimation error controls the endpoint deviation from the training set. These results imply that a generated sample can be represented as a training sample perturbed by three controlled terms: a bounded discretization-induced term, a bounded estimation-error-induced term, and stochastic Gaussian noise. Based on this characterization, we provide theoretical definitions of overfitting and underfitting in generative models. Synthetic simulations support our theoretical findings.
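
To make the oracle-setting claim concrete, the following is a minimal sketch, not the paper's construction. It assumes the linear interpolant x_t = (1 - t) z + t x_1 with z drawn from a standard Gaussian and x_1 drawn uniformly from a finite training set; the paper's interpolant may be more general. Under that assumption the velocity field minimizing the regression loss has the closed form v(x, t) = (E[x_1 | x_t = x] - x) / (1 - t), where the conditional expectation is a softmax-weighted average of training points. The names `optimal_velocity` and `euler_generate` and the toy data are hypothetical, and only the deterministic sampler without estimation error is covered.

```python
import numpy as np


def optimal_velocity(x, t, train):
    """Closed-form optimal velocity for the empirical distribution on `train`,
    under the ASSUMED interpolant x_t = (1 - t) * z + t * x_1, z ~ N(0, I)."""
    # Posterior weight of each training point given x_t = x:
    # proportional to exp(-||x - t * x_i||^2 / (2 * (1 - t)^2)).
    sq = np.sum((x[None, :] - t * train) ** 2, axis=1)
    logits = -sq / (2.0 * (1.0 - t) ** 2)
    logits -= logits.max()  # stabilize the softmax
    w = np.exp(logits)
    w /= w.sum()
    x1_hat = w @ train  # E[x_1 | x_t = x]
    return (x1_hat - x) / (1.0 - t)


def euler_generate(z, train, n_steps):
    """Integrate dx/dt = v(x, t) from t = 0 with n_steps explicit Euler steps."""
    x = z.astype(float).copy()
    h = 1.0 / n_steps
    for k in range(n_steps):
        t = k * h  # always strictly below 1, so (1 - t) never vanishes
        x = x + h * optimal_velocity(x, t, train)
    return x


# Hypothetical toy data: two training points and one fixed noise draw.
train = np.array([[-2.0, 0.0], [2.0, 0.0]])
z = np.array([0.7, 0.2])
for n_steps in (1, 4, 16, 64):
    x = euler_generate(z, train, n_steps)
    gap = np.min(np.linalg.norm(train - x, axis=1))
    print(n_steps, x, gap)
```

In this toy setup a single Euler step returns the training mean, since all posterior weights are equal at t = 0, while finer step sizes drive the endpoint onto one of the training points. This is in line with the abstract's statement that the deviation from training samples is controlled by the step size.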

Subject: Machine Learning

Published: 2026-06-07 10:14:07 UTC