2oRz0NNOQQ@OpenReview


Learning from A Single Markovian Trajectory: Optimality and Variance Reduction

Authors: Zhenyu Sun, Ermin Wei

In this paper, we consider general stochastic non-convex optimization when the sampling process follows a Markov chain. This setting captures many real-world applications, ranging from asynchronous distributed learning to reinforcement learning. In particular, we consider the worst case, in which one has no prior knowledge of, or control over, the Markov chain: multiple trajectories cannot be simulated, and only a single trajectory is available for algorithm design. We first provide algorithm-independent lower bounds of $\Omega(\epsilon^{-3})$ samples for mean-squared smooth objectives (and $\Omega(\epsilon^{-4})$ for smooth objectives) for any first-order method that accesses bounded-variance gradient oracles to reach an $\epsilon$-approximate critical point of the original problem. We then propose \textbf{Ma}rkov-\textbf{C}hain \textbf{SPIDER} (MaC-SPIDER), which leverages variance-reduction techniques to achieve an $\mathcal{O}(\epsilon^{-3})$ upper bound for mean-squared smooth objective functions. To the best of our knowledge, MaC-SPIDER is the first method to achieve $\mathcal{O}(\epsilon^{-3})$ complexity when sampling from a single Markovian trajectory, and our lower bound establishes its (near) optimality.
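To make the single-trajectory setting concrete, below is a minimal sketch of a SPIDER-style variance-reduced loop driven by one Markov chain. This is not the paper's actual MaC-SPIDER algorithm (whose details are not given in the abstract); the toy objective, the two-state chain, and all parameter values (`q`, `B`, `eta`) are illustrative assumptions. It only shows the generic SPIDER pattern: periodic estimator refreshes from a batch of consecutive chain samples, with recursive single-sample corrections in between.

```python
import random

# Hypothetical toy problem (for illustration only): minimize
#   f(x) = E_pi[F(x, s)],  F(x, s) = 0.5 * (a[s] * x - b[s])**2,  scalar x,
# where the state s follows a two-state Markov chain whose stationary
# distribution pi is uniform, so f(x) = 0.25*(x-1)^2 + 0.25*(2x-4)^2
# with minimizer x* = 1.8.
a = [1.0, 2.0]
b = [1.0, 4.0]

def grad_F(x, s):
    # Gradient of the per-sample loss F(x, s) with respect to x.
    return a[s] * (a[s] * x - b[s])

def next_state(s, rng, p_stay=0.9):
    # Symmetric two-state chain: stay with prob. p_stay, else flip.
    # Symmetry makes the stationary distribution uniform.
    return s if rng.random() < p_stay else 1 - s

def mac_spider(x0, eta=0.05, T=2000, q=50, B=50, seed=0):
    """SPIDER-style recursion driven by ONE Markov trajectory:
    every q iterations the estimator v is refreshed with a minibatch
    of B *consecutive* chain samples (no restarts, no independent
    trajectories); in between, v is updated recursively with a single
    fresh sample from the same trajectory."""
    rng = random.Random(seed)
    s = 0
    x_prev, x = x0, x0
    v = 0.0
    for t in range(T):
        if t % q == 0:
            # Refresh: average gradients over B consecutive samples.
            v = 0.0
            for _ in range(B):
                s = next_state(s, rng)
                v += grad_F(x, s) / B
        else:
            # Recursive variance-reduced correction with one sample.
            s = next_state(s, rng)
            v = grad_F(x, s) - grad_F(x_prev, s) + v
        x_prev, x = x, x - eta * v
    return x
```

Because the chain mixes (here with a spectral gap of 0.2), the consecutive-sample refresh batch is a biased but consistent estimate of the stationary gradient, and the iterates settle near the minimizer despite the sample correlation.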

Subject: NeurIPS.2025 - Poster