
#1 Stab-SGD: Noise-Adaptivity in Smooth Optimization with Stability Ratios

Authors: David A. R. Robin, Killian Bakong, Kevin Scaman

In the context of smooth stochastic optimization with first-order methods, we introduce the stability ratio of gradient estimates as a measure of the local relative noise level, ranging from zero for pure noise to one for negligible noise. We show that a schedule-free variant of stochastic gradient descent (Stab-SGD), obtained simply by shrinking the learning rate by the stability ratio, achieves genuine adaptivity to noise levels (i.e., without tuning hyperparameters to the gradient's variance), with all the key properties of a good schedule-free algorithm: neither plateau nor explosion at initialization, and no saturation of the loss. We believe this theoretical development reveals the importance of estimating the local stability ratio when constructing well-behaved (last-iterate) schedule-free algorithms, particularly when the hyperparameter-tuning budget is a small fraction of the total budget, the regime in which noise adaptivity and cheaper horizon-free tuning matter most.

Subject: NeurIPS.2025 - Poster
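
The abstract does not spell out how the stability ratio is estimated or how the update is formed, so the following is only a minimal illustrative sketch. It assumes the ratio is estimated per mini-batch as ||mean gradient||^2 divided by the mean squared gradient norm, a quantity in [0, 1] that matches the described endpoints (near zero for pure noise, near one for negligible noise); the function names stability_ratio and stab_sgd_step, and the estimator itself, are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def stability_ratio(per_sample_grads, eps=1e-12):
    """Hypothetical estimator of the local stability ratio in [0, 1].

    Ratio of the squared norm of the batch-mean gradient to the mean
    squared per-sample gradient norm: close to 1 when noise is
    negligible, close to 0 when estimates are dominated by noise.
    The paper's actual estimator may differ.
    """
    mean_grad = per_sample_grads.mean(axis=0)
    signal = np.dot(mean_grad, mean_grad)
    power = np.mean(np.sum(per_sample_grads**2, axis=1))
    return signal / (power + eps)

def stab_sgd_step(params, per_sample_grads, base_lr):
    """One Stab-SGD-style update (sketch): shrink the base learning
    rate by the estimated stability ratio before the usual SGD step."""
    rho = stability_ratio(per_sample_grads)
    return params - base_lr * rho * per_sample_grads.mean(axis=0)

# Toy usage on noisy gradients of f(x) = 0.5 * ||x||^2.
rng = np.random.default_rng(0)
x = np.ones(10)
for _ in range(100):
    # Batch of 32 noisy gradient estimates of the true gradient x.
    grads = x + rng.normal(scale=0.5, size=(32, 10))
    x = stab_sgd_step(x, grads, base_lr=0.5)
print(np.linalg.norm(x))
```

In this sketch the effective step size base_lr * rho shrinks automatically as the iterates approach the noise floor (where the batch-mean gradient becomes small relative to per-sample noise), which is the qualitative behavior the abstract attributes to shrinking the learning rate by the stability ratio.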