2026-06-19 | | Total: 238
Bilevel optimization is an indispensable modeling tool for modern machine learning and engineering design. However, the theory and practice for finding second order stationary points in the context of bilevel optimization still remain largely unsettled. Even for bilevel optimization with strongly convex lower-level problem, the hyperfunction it induces is in general nonconvex. Although the Cubic Regularized Newton methods (CRN) famously achieve the optimal $\mathcal{O}(\varepsilon^{-1.5})$ SOSP (second-order stationary point) rate in single-level optimization, it is unclear how to control the accuracy of the hypergradient and hyper-Hessian computations in the context of applying the second-order methods to bilevel problems in order for the overall process to be efficient. In this paper, we set out to answer this question. In particular, we first formulate a double loop CRN baseline that achieves the optimal outer rate but requires repeated lower level solves. Next, we propose a single loop cubic regularized Newton algorithm that combines one lower-level gradient step with one Newton step for the hypergradient, and prove an overall deterministic $\mathcal{O}(\varepsilon^{-1.5})$ total oracle complexity, which is optimal. In addition, we illustrate that some intuitively simple modifications of our method may fail to hold up the convergence result. To the best of our knowledge, this is the first deterministic single loop method for unconstrained NCSC (non-convex upper-level and strongly convex lower-level) bilevel optimization setting that achieves the $\mathcal{O}(\varepsilon^{-1.5})$ optimal convergence rate for finding an $\varepsilon$-SOSP of the hyperfunction.
In light of recent advances in conformal blow-up methods for the positive mass theorem, including He--Shi--Yu, Bi--Hao--He--Shi--Zhu, and Brendle--Wang, we develop a Schoen--Yau type singular dimension descent method for positive scalar curvature obstructions in arbitrary dimensions. We prove obstructions to positive scalar curvature on enlargeable manifolds and establish the corresponding cubical width inequalities and two-systole estimates. The method also applies to enlargeable AM--PI spaces, giving a positive scalar curvature obstruction when the singular set has Assouad codimension greater than \(3-2/n\).
An elastic space curve is a critical point of the bending energy subject to appropriate constraints. An analytic representation, equivalent to the spherical pendulum equation, leads to an 11-parameter description of the space of 3D elastic curve segments. We give a numerically stable method for recovering the 11 parameters from a given elastic curve segment. Using this, we give a fast and stable method to approximate an arbitrary space curve segment by a 3D elastica. Applications include interactive design with exact elastic curves and CAD surface rationalization for robotic hot-blade cutting.
Throughout this work, we will carry out a rigorous mathematical analysis of a class of control systems that is widely used in applications but still lacks a consistent theoretical foundation for describing the types of limit sets that may arise from its dynamics. There are applications in which, for example, a treatment for a given disease is administered until the level of diseased cells falls below a prescribed threshold C1. At that point, the treatment is suspended in order to allow the patient's organism to recover from its side effects. Subsequently, when the level of diseased cells reaches a second threshold C2 bigger than C1, the treatment is resumed, and the protocol is repeated. To the best of our knowledge, there is not a mathematical classification of such models. In this paper, we initiate what is intended to become a consistent body of literature aimed at determining the limit sets of such models. We begin with the planar case, in which two linear vector fields are active and two switching boundaries are considered. Naturally, in future developments, control systems in higher dimensions, featuring additional vector fields and more general switching manifolds, should also be considered.
We investigate the spectral properties of the Anderson operator perturbed by a localized negative potential, \(-V\). Specifically, we analyze the random Schrödinger operator defined by \(H = -Δ+\ve \sum_{n} ω_n χ_n - V\), where the unperturbed operator exhibits a disordered energy landscape. Our primary focus is to establish precise estimates on the number of negative eigenvalues (bound states) induced by the attractive perturbation. By analyzing the competition between Anderson localization and the binding capacity of the potential, we provide quantitative bounds on the discrete spectrum. These results offer new insights into how randomness enhances the eigenvalue bounds.
We introduce the CLUSTER algorithm (\textbf{c}oordinate-\textbf{l}evel \textbf{u}pdate \textbf{s}trategy for \textbf{t}rust-region step \textbf{e}valuation \textbf{r}efinement) for local derivative-free optimization problems where there is a cost to changing each parameter (or clusters of parameters). For example, this type of cost model is appropriate for optimizing robot-controlled laboratory experiments, in which a robot may incur a separate motion for each parameter cluster to be adjusted. We build off of a class of quadratic-interpolation optimization algorithms by Powell and Conn that are known to perform well for twice-differentiable objectives (e.g. low-noise experiments), and show that the CLUSTER variants improve performance on a variety of test problems (including an optics laboratory experiment) by around 50$\%$, and greatly outperform common competing algorithms for laboratory optimization (Bayesian optimization and Nelder--Mead). We also adapt the convergence proof of the Conn algorithm to obtain a similar convergence guarantee for CLUSTER-Conn.
WepresentatwolevelSchwarzmethodasanalternativetoAlgebraicMultigridmethod(AMG) used as the last level (coarse) solver of the p-multigrid pMG preconditioner for pressure Poission equation resulting from Spectral/Finite element descretization of incompressible Navier-Stokes eqaution. Proposed Schwarz method consits of a local problem in the original pMG coarse space and a global coarse problem. Main contribution of the paper is a novel, structured and a non-nested coarse space for the global coarse problem. Structured nature of the proposed global coarse space enable communication-free interpolation between the original p-multgrid coarse space and the global coarse problem. We demonstrate the effectiveness of the proposed method compared to the state of the art AMG solver BoomerAMG by a series of experiments performed using Nek5000/RS, a suite of highly scalable incompressible Navier-Stokes solvers, on Summit/Frontier supercomputers at Oak Ridge Leadership Computing Facility.
The \emph{minimum positive codegree} $δ^+_{k-1}(G)$ of a $k$-graph $G$ is the minimum, over all $(k-1)$-sets that lie in at least one edge, of the number of edges containing that set. The \emph{positive codegree Turán density} of a $k$-graph family $\mathcal{F}$ is the asymptotically maximum value of $δ^+_{k-1}(G)/n$ over all $\mathcal{F}$-free $k$-graphs $G$ with $n\to\infty$ vertices. In this note, we establish a strong version of non-principality with respect to this density by proving that for every $k\ge3$ there exist two $k$-graphs $F_1$ and $F_2$ such that $$ 0<γ^+(F_1, F_2) < \min\{γ^+(F_1), γ^+(F_2)\}. $$
We consider a family of optimization problems, based on a mean-field description of particles interacting through Coulomb forces in a quadratic trap. In addition, the particles are constrained to lie in a halfspace and we are interested in the way the particle distribution changes as the halfspace varies. In particular, we can prove the existence of a phase transition, thereby settling a recent conjecture by Byun, Forrester, Majumdar and Schehr.
Given a fixed past-directed timelike vector field, does there exist a time function whose gradient is optimally aligned with it? We address this question by introducing a functional that, on the one hand, captures the misalignment between the timelike vector field and the gradients of suitable Sobolev functions, and, on the other hand, penalizes null gradients. Our analysis focuses on compact subsets of smooth stably causal spacetimes. More precisely, we prove that, under suitable assumptions on the Sobolev index and the strength of the null gradient penalization, there exists a unique smooth temporal function which minimizes the considered functional. We refer to this minimizer as the \emph{alignment time function}. Furthermore, several useful properties of the alignment time function are established: there exists a canonical procedure to improve its steepness, it is stable under $C^{p}$ convergence of the underlying metrics and vector fields and it inherits the symmetries shared by the metric and the given vector field.
We consider contraction of Bayesian posterior distributions in nonparametric settings where coefficients of a function over a basis or dictionary are given priors with $p$--exponential tails, including Laplace tails $(p=1)$ and heavier tails $(p<1)$. It is shown that contraction rates improve as $p$ decreases and that full adaptation to smoothness, up to logarithmic factors, is obtained in an appropriate $p\to 0$ regime. As applications, we consider both series priors in white noise regression and shallow ReLU neural networks in random design regression. In particular, we show that overparametrised shallow ReLU networks can adapt to any regularity $0\le β\le 2$. Through a simulation study, we show strong empirical agreement with the behavior predicted by our theory.
In this paper, a braid is regarded as a dynamical system of points in the plane. The states of this dynamical system are given by Delaunay triangulations. This construction makes it possible to define an abstract groupoid $\overset{abc}{\mathcal{G}^{4}_{n+3}}$, which gives a representation of the colored braid groupoid $\text{ColB}(n)$. We define homomorphisms ${f}_{n+3}:\overset{abc}{\mathcal{G}^{4}_{n+3}} \rightarrow\text{GL}_{2n+1}(\mathbb{Q})$ and ${f}'_{n+3}:\overset{abc}{\mathcal{G}^{4}_{n+3}} \rightarrow\text{GL}_{2n+1}(\mathbb{C})$, and describe an algorithm for computing the resulting invariants.
We introduce the notion of a coarsely minimal Reeb flow, generalizing the notion of minimal geodesic flow, and prove the following rigidity theorem: That a coarsely minimal Reeb flow satisfying a divergence property is orbitally equivalent to the geodesic flow of a Riemannian metric of negative sectional curvature. Without the divergence assumption, we obtain an orbital semi-equivalence. This extends a rigidity result for geodesic flows of negatively curved Riemannian metrics which is due to Gromov. We use Floer homology and Morse's hyperbolic `stability' Lemma.
We prove that a lacunary hyperbolic group $G = \varinjlim G_i$ with sufficient generics is selfless in the sense of Amrutam--Gao--Kunnawalkam Elayavalli--Patchell, provided the hyperbolicity constants $δ_i$ and injectivity radii $r_i$ satisfy $δ_i(\log r_i)^{7} = o(r_i)$. The proof replaces the acylindricity-based machinery of that work with a direct geodesic $n$-gon criterion due to Arzhantseva, which applies in any $δ$-hyperbolic space. As a consequence, combined with rapid decay, $G$ is $C^*$-selfless. The condition is mild: torsion-free Tarski monsters, Jacobson's mixed-identity-free elementary amenable groups and Gromov monster groups satisfy it for appropriate parameter choices. The amenable examples are selfless but cannot be $C^*$-selfless, providing examples that separate these properties. Finally we remark that the Gromov monster group examples provide a potential avenue to a non-exact $C^*$-algebra with strict comparison.
We prove that every graph admits a linked, componental, rooted tree-cut decomposition of finite adhesion that displays all undominated edge-ends. As a first application, we deduce that this tree-cut decomposition also displays the edge-degrees of all undominated edge-ends. For locally finite graphs $-$ where every end is an undominated edge-end $-$ this yields a linked tree-cut decomposition of finite adhesion into $\textit{finite}$ parts that displays all ends and their edge-degrees. As a second application, this latter tree-cut decomposition yields short, unified deductions of Thomassen's theorem on boundary-linked finite partitions, and of Bruhn and Stein's characterisation of Eulerian locally finite graphs in terms of even ends.
We report some work in progress on the relationship between étale and quasicoherent cohomological dimensions.
Torsion pairs, and in particular t-structures, play a central role in the study of triangulated categories. Specifically, t-structures induced by silting (or tilting) objects often admit desirable properties with strong connections to derived equivalences. In this paper, using the correspondence of Saorín-Šťovíček between cohereditary cotorsion pairs in Frobenius exact categories and t-structures in their stable categories, we construct a family of t-structures in the $Q$-shaped derived category of Holm and Jorgensen, arising from admissible partitions of $Q$. We give an explicit description of the associated cotorsion pairs inside the Frobenius exact category of the bifibrant objects, and we identify the corresponding co-aisles by certain homological vanishing conditions. Such t-structures are proved to be induced by a silting object, that can be completely determined by the combinatorics of $Q$. Finally, we illustrate our results by recovering well-known equivalences in the $Q$-shaped setting, while also providing examples where the combinatorial conditions fail (e.g. cyclic quivers), showing that such categories may admit no non-trivial t-structures, revealing phenomena analogous to those observed by Linckelmann in stable module categories.
We establish a direct high-probability last-iterate guarantee for the standard same-sample two-point Gaussian zeroth-order stochastic-gradient method applied to smooth, strongly convex stochastic optimization. At each iteration, the method draws a fresh Gaussian direction, evaluates the objective at two symmetric perturbations using the same stochastic sample, and takes a norm-normalized stochastic-approximation step. Assuming unbiased stochastic gradients and a conditional exponential-moment bound on the squared norm of the stochastic-gradient noise, we prove that, whenever \(d\ge16\log(6T/δ)\), \[ f(\bx_T)-f(\bx^*) = \widetilde{\mathcal O}\!\left(\frac{d}{T}\right) \] with probability at least \(1-δ\), up to fixed problem parameters and logarithmic factors. The confidence dependence is therefore logarithmic rather than polynomial in \(1/δ\). The analysis is direct: it neither invokes Markov's inequality to convert an expectation bound nor truncates the noise. We are not aware of a prior direct high-probability last-iterate result at this zeroth-order scale for the same-sample Gaussian recursion under conditional sub-Gaussian stochastic-gradient noise. The proof combines a uniform weighted scan for Gaussian angles with an angle-enlarged product-martingale boundary that controls the signed suffix-product term arising from the unrolled stochastic recursion.
Start with four digits, arrange them in both descending and ascending order, subtract, and repeat. This simple process is known as the Kaprekar routine, famous in base ten for sending every nonconstant four-digit string to $6174$. We show that in every odd base $B>3$, the four-digit Kaprekar map has an unexpectedly rigid structure. After at most three iterations, every nonconstant orbit enters an explicit triangular region $\mathcal{T}_B$, and on this region the map is conjugate to projective doubling: \[ \{[r],[s]\}\longmapsto \{[2r],[2s]\}. \] This gives a complete finite description of all nonconstant terminal cycles, including an explicit formula for their lengths and counts. In particular, the longest terminal cycle has length at most $(B-1)/2$, and equality can occur only when $B$ is prime. For primes $p>5$, equality occurs precisely when the least positive $m$ with $2^m\equiv\pm1\pmod p$ is $m=(p-1)/2$. The results proved here were first formulated by Schwartz and Thakur. As a test case for AI-assisted formal mathematics, AxiomProver produced Lean/mathlib formalizations of these results.
Any linear space of square matrices has an associated eigenvector variety. Its points are eigenvectors of matrices from that linear space. We present a systematic study of eigenvector varieties, with focus on Lie algebras and Hamiltonians of quantum systems.
We study interactions between simplex faces of lattice polytopes and quadratic generation of toric ideals. We prove that, under a mild condition on edges, if the toric ideal of a lattice polytope is generated by quadratic binomials, then every clique of its 1-skeleton is the vertex set of a face. In particular, if the toric ideal of a $(0,1)$-polytope is generated by quadratic binomials, then every clique of its 1-skeleton is the vertex set of a face. For $(0,1)$-polytopes satisfying condition (E), we characterize this clique-face property in terms of divisibility by quadratic monomials appearing in quadratic binomials of the toric ideal; as a consequence, such toric ideals have no indispensable monomials of degree $\ge 3$. We apply these results to edge polytopes and cut polytopes, for which the clique-face property is equivalent to quadratic generation. Finally, motivated by conjectures on quadratic toric ideals, we verify the clique-face property for simple polytopes, matroid independence polytopes, and matroid base polytopes, and discuss stable set polytopes.
We introduce the $G$-Daugavet property ($G$-DPr, for short) for Banach spaces endowed with an action of a group $G$ by surjective linear isometries. This notion provides a common framework for the classical Daugavet property and the alternative Daugavet property, which correspond respectively to the trivial action and to the scalar action of $S_{\mathbb{K}}$. We establish several characterizations of the $G$-DPr in terms of $G$-slices and closed convex $G$-invariant hulls, recovering the usual slice descriptions of the DPr and the aDPr as particular cases. We show that the presence of a group action leads to new behavior in Daugavet theory. In particular, the $G$-DPr may hold on classical reflexive spaces in sharp contrast with the classical Daugavet property. We relate this phenomenon to convex transitivity, almost transitivity and finite-dimensional rotation problems. We also prove group-action versions of the classical characterizations for $L^1(μ, X)$- and $C(K,X)$-spaces. The paper also studies group separable determination, $G$-versions of numerical radius and numerical index, and connections between the $G$-DPr and strong Radon-Nikodým and SCD operators. Finally, we introduce a parameter which measures how far the $G$-DPr is from the classical DPr in a quantitative manner. As a consequence of these results, we obtain conditions under which the $G$-DPr recovers several classical implications, including the failure of the RNP for both $X$ and $X^*$, the presence of copies of $\ell_1$ and the failure of the unit ball to be an SCD set.
We reconcile privacy protection and rate-double-robust inference. The privacy of individuals is protected by a local privacy mechanism: injecting noise into their sensitive data, revealing only the noisy data for inference. Hence, privacy protection hinders inference. In contrast, the inference of a target parameter is rate-double-robust when the large-sample bias of an estimator of the parameter is characterised by a trade-off between the estimation errors of two other, nuisance, parameters. Hence, rate-double-robustness facilitates inference. Our starting point of reconciliation is a class of rate-double-robust target parameters indexed linearly by an infinite-dimensional and nonlinearly by a low-dimensional regression. Among others, this includes causal parameters. To infer these targets privately, we show how suitable privacy mechanisms transfer the semiparametric properties of the sensitive-data model to the private setting. Rate-double-robustness is transferred, enabling locally-private, unbiased and semiparametrically efficient inference of our target parameters. Finally, we transform general nonparametric nuisance estimators into private ones, which inherit convergence properties of their nonprivate counterparts. For parametric nuisance models, we develop a private method-of-moments estimator and its large-sample inference theory.
We study Ziegler pairs of line arrangements from both numerical and homological perspectives. First, we show that for arrangements of $d<9$ lines the intersection lattice determines the exponent data considered here. Then we list six distinct Ziegler pair with $d=10$. In particular, we construct higher-degree examples with the same intersection lattice, the same minimal degree of a Jacobian relation, and the same Hilbert function of the Milnor algebra, but with different minimal graded free resolutions.
A c-direct category is a small category equipped with an ordinal degree function such that every morphism is level or degree-raising. Every c-direct category is c-Reedy. The c-Reedy model structure on any functor category from a c-direct category to a model category coincides with the projective model structure. In this framework, a realization functor is a colimit-preserving functor satisfying some mild homotopical conditions from the category of presheaves on a c-direct category with cofibrant representables to a model category. We prove that any two such realization functors are weakly equivalent on cofibrant presheaves. For categories of cubes, we prove that thick categories have cofibrant representables. As an application, we introduce the $\varepsilon$-branching space of an $\mathcal A$-set for any thick category of cubes $\mathcal A$. It is obtained as a coend over a c-direct category with cofibrant representables constructed from $\mathcal A$. We prove that, on free $\mathcal A$-sets generated by precubical sets, this new definition coincides with the earlier one. We prove that, for cofibrant $\mathcal A$-sets, the resulting space is independent of $\varepsilon$ up to homotopy.