Extensive research has shown that a wide range of machine learning problems can be formulated as bilevel optimization, where two levels of learning processes are intertwined through distinct sets of optimization variables. However, prevailing approaches often impose stringent assumptions, such as strong convexity of the lower-level loss function or uniqueness of the optimal solution, to enable algorithmic development and convergence analysis; such assumptions tend to be overly restrictive in real-world scenarios. In this work, we explore a recently popularized Moreau envelope-based reformulation of bilevel optimization problems that accommodates nonconvex objective functions at both levels. We propose a stochastic primal-dual method, incorporating smoothing on both sides, that is capable of finding Karush-Kuhn-Tucker (KKT) solutions for this general class of nonconvex bilevel optimization problems. A key feature of our algorithm is its ability to dynamically weight the lower-level problems, enhancing its performance, particularly in stochastic learning scenarios. Numerical experiments demonstrate the superiority of the proposed algorithm over existing penalty-based methods in terms of both convergence rate and test accuracy.
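For concreteness, the display below sketches the Moreau envelope-based reformulation in the form it commonly takes in the literature; the symbols $F$ (upper-level loss), $f$ (lower-level loss), and $\gamma > 0$ (proximal parameter) are notation assumed here for illustration rather than the paper's fixed notation.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% A minimal sketch of the Moreau envelope-based reformulation,
% with F (upper-level loss), f (lower-level loss), and gamma > 0
% (proximal parameter) assumed here for illustration.

% Original bilevel problem:
\[
  \min_{x,\,y}\; F(x, y)
  \quad \text{s.t.} \quad
  y \in \operatorname*{arg\,min}_{\theta}\; f(x, \theta).
\]

% Moreau envelope of the lower-level objective:
\[
  v_\gamma(x, y) = \min_{\theta}\;
  \Bigl\{ f(x, \theta) + \tfrac{1}{2\gamma}\,\lVert \theta - y \rVert^2 \Bigr\}.
\]

% Equivalent single-level problem with a value-function-type
% constraint, whose Karush-Kuhn-Tucker points a primal-dual
% method can target:
\[
  \min_{x,\,y}\; F(x, y)
  \quad \text{s.t.} \quad
  f(x, y) - v_\gamma(x, y) \le 0.
\]
\end{document}
```

The constraint $f(x, y) - v_\gamma(x, y) \le 0$ replaces the lower-level optimality condition with a smoothed value-function gap, which remains well defined even when the lower-level problem is nonconvex and its solution set is non-unique.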