In this paper, we propose a class of faster double adaptive gradient methods for solving nonconvex finite-sum optimization problems, possibly with a nonsmooth regularizer, by simultaneously using an adaptive learning rate and an adaptive mini-batch size. Specifically, we first propose a double adaptive stochastic gradient method (i.e., 2AdaSGD), and prove that 2AdaSGD achieves a low stochastic first-order oracle (SFO) complexity for finding a stationary solution under the population smoothness condition. Furthermore, we propose a variance-reduced double adaptive stochastic gradient method (i.e., 2AdaSPIDER), and prove that 2AdaSPIDER achieves an optimal SFO complexity under the average smoothness condition, which is lower than the SFO complexity of existing double adaptive gradient algorithms. In particular, we introduce a new stochastic gradient mapping to adaptively adjust the mini-batch size in our stochastic gradient methods. We conduct numerical experiments to verify the efficiency of the proposed methods.
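To make the "double adaptive" idea concrete, the following is a minimal, hedged sketch of a loop that adapts both quantities: the step size is scaled AdaGrad-Norm style, and the mini-batch size is enlarged whenever a simple gradient-based proxy falls below a threshold. This is an illustration under assumed choices, not the paper's 2AdaSGD update or its stochastic gradient mapping; the names double_adaptive_sgd, grad_fn, eta0, b0, rho, and tol are all hypothetical.

```python
import numpy as np

def double_adaptive_sgd(grad_fn, x0, n_samples, T=200, eta0=1.0,
                        b0=8, rho=2.0, tol=1e-3, seed=0):
    """Illustrative double adaptive SGD loop (assumed design, not the
    paper's exact method): AdaGrad-Norm style step size plus a mini-batch
    size that grows when the mini-batch gradient norm drops below `tol`.

    grad_fn(x, idx) should return the average gradient of the sampled
    component functions indexed by `idx` at the point `x`.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    b = b0          # current mini-batch size (adaptive)
    acc = 0.0       # accumulated squared gradient norms

    for _ in range(T):
        idx = rng.choice(n_samples, size=min(b, n_samples), replace=False)
        g = grad_fn(x, idx)

        # Adaptive learning rate: shrink the step as gradient mass accumulates.
        acc += float(np.dot(g, g))
        eta = eta0 / np.sqrt(acc + 1e-12)

        # Adaptive mini-batch size: a small gradient proxy suggests noise
        # dominates the signal, so enlarge the batch by a factor rho.
        if np.linalg.norm(g) < tol and b < n_samples:
            b = int(min(rho * b, n_samples))

        x = x - eta * g
    return x


# Toy usage on a least-squares finite sum, just to exercise the loop.
A = np.random.default_rng(1).normal(size=(1000, 5))
y = A @ np.ones(5)
grad_fn = lambda x, idx: A[idx].T @ (A[idx] @ x - y[idx]) / len(idx)
x_hat = double_adaptive_sgd(grad_fn, x0=np.zeros(5), n_samples=1000)
```

In this sketch the batch-growth test uses the raw mini-batch gradient norm purely for simplicity; the paper's contribution is a dedicated stochastic gradient mapping for this role, which also covers the nonsmooth regularized setting.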