Convergence, design and training of continuous-time dropout as a random batch method

Authors: Antonio Álvarez-López, Martín Hernández

We study dropout regularization in continuous-time models through the lens of random-batch methods -- a family of stochastic sampling schemes originally devised to reduce the computational cost of interacting particle systems. We construct an unbiased, well-posed estimator that mimics dropout by sampling neuron batches over time intervals of length $h$. Trajectory-wise convergence is established with linear rate in $h$ for the expected uniform error. At the distribution level, we establish stability for the associated continuity equation, with total-variation error of order $h^{1/2}$ under mild moment assumptions. During training with fixed batch sampling across epochs, a Pontryagin-based adjoint analysis bounds deviations in the optimal cost and control, as well as in gradient-descent iterates. On the design side, we compare convergence rates for canonical batch sampling schemes, recover standard Bernoulli dropout as a special case, and derive a cost--accuracy trade-off yielding a closed-form optimal $h$. We then specialize to a single-layer neural ODE and validate the theory on classification and flow matching, observing the predicted rates, regularization effects, and favorable runtime and memory profiles.
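As a rough illustration of the estimator described in the abstract, the sketch below integrates a single-layer neural ODE while resampling a neuron batch on each interval of length $h$. It is a minimal sketch under stated assumptions, not the authors' implementation: the field $W \tanh(Ax + b)$, the Bernoulli($q$) mask (the special case the abstract says recovers standard dropout), the explicit Euler integrator, and every function and parameter name are hypothetical choices made for illustration.

```python
import numpy as np

def full_field(x, W, A, b):
    """Full single-layer vector field W @ tanh(A @ x + b) (assumed form)."""
    return W @ np.tanh(A @ x + b)

def dropout_field(x, W, A, b, mask, q):
    """Field restricted to the active neurons, rescaled by 1/q so that its
    expectation over Bernoulli(q) masks equals full_field (unbiasedness)."""
    return W @ (mask / q * np.tanh(A @ x + b))

def simulate(x0, W, A, b, T, h, q, rng, substeps=10):
    """Explicit Euler on [0, T]; a fresh mask is drawn every h time units
    and held fixed inside that interval (assumed discretization)."""
    x = np.array(x0, dtype=float)
    n_intervals = int(round(T / h))
    dt = h / substeps
    p = W.shape[1]  # number of neurons
    for _ in range(n_intervals):
        mask = (rng.random(p) < q).astype(float)  # Bernoulli(q) neuron batch
        for _ in range(substeps):
            x = x + dt * dropout_field(x, W, A, b, mask, q)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, p = 2, 16  # state dimension and width, arbitrary illustrative values
    W = rng.normal(size=(d, p)) / p
    A = rng.normal(size=(p, d))
    b = rng.normal(size=p)
    x0 = np.array([0.5, -0.2])
    # q = 1 keeps every neuron, so this is the full-model reference.
    x_ref = simulate(x0, W, A, b, T=1.0, h=0.1, q=1.0, rng=rng)
    for h in (0.5, 0.1, 0.02):
        errs = [np.linalg.norm(simulate(x0, W, A, b, 1.0, h, 0.5, rng) - x_ref)
                for _ in range(200)]
        print(f"h={h}: mean terminal error {np.mean(errs):.4f}")
```

Setting `q=1.0` keeps every neuron and reproduces the full model, so the loop at the bottom gives a crude empirical look at how the terminal error behaves as `h` shrinks. It measures only the terminal state, not the uniform-in-time error the abstract refers to.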

Subjects: Machine Learning, Optimization and Control

Publish: 2025-10-15 04:19:01 UTC

2510.13134
