2025-07-08 | | Total: 52
Bandits with knapsacks (BwK) constitute a fundamental model that combines aspects of stochastic integer programming with online learning. Classical algorithms for BwK with a time horizon T achieve a problem-independent regret bound of O(√T) and a problem-dependent bound of O(logT). In this paper, we initiate the study of the BwK model in the setting of quantum computing, where both reward and resource consumption can be accessed via quantum oracles. We establish both problem-independent and problem-dependent regret bounds for quantum BwK algorithms. For the problem-independent case, we demonstrate that a quantum approach can improve the classical regret bound by a factor of (1+√B/OPTLP), where B is budget constraint in BwK and OPTLP denotes the optimal value of a linear programming relaxation of the BwK problem. For the problem-dependent setting, we develop a quantum algorithm using an inexact quantum linear programming solver. This algorithm achieves a quadratic improvement in terms of the problem-dependent parameters, as well as a polynomial speedup of time complexity on problem's dimensions compared to classical counterparts. Compared to previous works on quantum algorithms for multi-armed bandits, our study is the first to consider bandit models with resource constraints and hence shed light on operations research.
This paper identifies necessary and sufficient conditions for the exactness of penalty functions in optimization problems whose constraint sets are not necessarily bounded. The case where the data of problems is locally Lipschitz or semi-algebraic is studied in detail. The conditions are given in terms of properties of the objective and residual functions of the problems in question. The obtained results generalize and improve some known results in the literature on exact penalty functions.
Deep neural policies have unlocked agile flight for quadcopters, adaptive grasping for manipulators, and reliable navigation for ground robots, yet their millions of weights conflict with the tight memory and real-time constraints of embedded microcontrollers. Second-order pruning methods, such as Optimal Brain Damage (OBD) and its variants, including Optimal Brain Surgeon (OBS) and the recent SparseGPT, compress networks in a single pass by leveraging the local Hessian, achieving far higher sparsity than magnitude thresholding. Despite their success in vision and language, the consequences of such weight removal on closed-loop stability, tracking accuracy, and safety have remained unclear. We present the first mathematically rigorous robustness analysis of second-order pruning in nonlinear discrete-time control. The system evolves under a continuous transition map, while the controller is an L-layer multilayer perceptron with ReLU-type activations that are globally 1-Lipschitz. Pruning the weight matrix of layer k replaces Wk with Wk+δWk, producing the perturbed parameter vector ˆΘ=Θ+δΘ and the pruned policy π(⋅;ˆΘ). For every input state s∈X we derive the closed-form inequality ‖ where the constant C_k(s) depends only on unpruned spectral norms and biases, and can be evaluated in closed form from a single forward pass. The derived bounds specify, prior to field deployment, the maximal admissible pruning magnitude compatible with a prescribed control-error threshold. By linking second-order network compression with closed-loop performance guarantees, our work narrows a crucial gap between modern deep-learning tooling and the robustness demands of safety-critical autonomous systems.
We present \texttt{MathOptAI.jl}, an open-source Julia library for embedding trained machine learning predictors into a JuMP model. \texttt{MathOptAI.jl} can embed a wide variety of neural networks, decision trees, and Gaussian Processes into a larger mathematical optimization model. In addition to interfacing a range of Julia-based machine learning libraries such as \texttt{Lux.jl} and \texttt{Flux.jl}, \texttt{MathOptAI.jl} uses Julia's Python interface to provide support for PyTorch models. When the PyTorch support is combined with \texttt{MathOptAI.jl}'s gray-box formulation, the function, Jacobian, and Hessian evaluations associated with the PyTorch model are offloaded to the GPU in Python, while the rest of the nonlinear oracles are evaluated on the CPU in Julia. \MathOptAI is available at https://github.com/lanl-ansi/MathOptAI.jl under a BSD-3 license.
Recent advances in Model Predictive Control (MPC) leveraging a combination of first-order methods, such as the Alternating Direction Method of Multipliers (ADMM), and offline precomputation and caching of select operations, have excitingly enabled real-time MPC on microcontrollers. Unfortunately, these approaches require the use of fixed hyperparameters, limiting their adaptability and overall performance. In this work, we introduce First-Order Adaptive Caching, which precomputes not only select matrix operations but also their sensitivities to hyperparameter variations, enabling online hyperparameter updates without full recomputation of the cache. We demonstrate the effectiveness of our approach on a number of dynamic quadrotor tasks, achieving up to a 63.4% reduction in ADMM iterations over the use of optimized fixed hyperparameters and approaching 70% of the performance of a full cache recomputation, while reducing the computational cost from O(n^3) to O(n^2) complexity. This performance enables us to perform figure-eight trajectories on a 27g tiny quadrotor under wind disturbances. We release our implementation open-source for the benefit of the wider robotics community.
A classical approach to the Calderón problem is to estimate the unknown conductivity by solving a nonlinear least-squares problem. It is generally believed that it leads to a nonconvex optimization problem which is riddled with bad local minimums. This has motivated the development of reconstruction methods based on convex optimization, one recent contribution being the nonlinear convex semidefinite programming approach of Harrach (2023). In this work, we investigate the computational viability of this convex approach in a simple setting where the conductivities are piecewise constant and radial. We implement this convex reconstruction method and compare it extensively to the least squares approach. Our experiments suggest that this convex programming approach only allows to accurately estimate the unknown for problems with a very small size. Moreover, surprisingly, it is consistently outperformed by Newton-type least squares solvers, which are also faster and require less measurements. We revisit the issue of nonconvexity in this piecewise constant radial setting and prove that, contrary to previous claims, there are no local minimums in the case of two scalar unknowns with no measurement noise. We also provide a partial proof of this result in the general setting which holds under a numerically verifiable assumption.
The z-transform of a sequence is a classical tool used within signal processing, control theory, computer science, and electrical engineering. It allows for studying sequences from their generating functions, with many operations that can be equivalently defined on the original sequence and its z-transform. In particular, the z-transform method focuses on asymptotic behaviors and allows the use of Taylor expansions. We present a sequence of results of increasing significance and difficulty for linear models and optimization algorithms, demonstrating the effectiveness and versatility of the z-transform method in deriving new asymptotic results. Starting from the simplest gradient descent iterations in an infinite-dimensional Hilbert space, we show how the spectral dimension characterizes the convergence behavior. We then extend the analysis to Nesterov acceleration, averaging techniques, and stochastic gradient descent.
The Generalized Nash Equilibrium Problem refers to the question of the existence of a Nash equilibrium in an abstract economy. This model is due to Kenneth J. Arrow and Gerard Debreu in their pioneering work from 1954. An abstract economy is an extension of John Nash's original concept of a non-cooperative game from the 1950's. Here players selfishly seek to maximize their profits, which may depend on the others' choices. The novelty of an abstract economy is that the players may now mutually constrain each-other in their decision-making. A Nash equilibrium is reached when no player alone can increase his profit by a unilateral change of strategy. Abstract economies have found widespread applications from welfare economy, economic analysis and policy-making to constrained optimization, partial differential equations and optimal allocation. We generalize Leigh Tesfatsion's Nash equilibrium existence result to abstract economies without resorting to commonly employed convexity assumptions, thereby also re-proving all known Nash equilibrium existence results as special cases. A key element of our proof is the translation of the Generalized Nash Equilibrium Problem into the question of the existence of a coincidence of two particularly defined maps, and the application of a coincidence result from algebraic topology due to Samuel Eilenberg and Deane Montgomery in 1946.
We study a generalization of Eulerian polynomials to the multivariate setting introduced by Brändén. Although initially these polynomials were introduced using the language of hyperbolic and stable polynomials, we manage to translate some restrictions of these polynomials to our real zero setting. Once we are in this setting, we focus our attention on the rigidly convex sets (RCSs) defined by these polynomials. In particular, we study the corresponding rigidly convex sets looking at spectrahedral relaxations constructed through the use of monic symmetric linear matrix polynomials (MSLMPs) of small size and depending polynomially (actually just cubically) on the coefficients of the corresponding polynomials. We analyze how good are the obtained spectrahedral approximations to these rigidly convex sets. We do this analysis by measuring the behavior along the diagonal, where we precisely recover the original univariate Eulerian polynomials. Thus we conclude that, measuring through the diagonal, our relaxation-based spectrahedral method for approximation of the rigidly convex sets defined by multivariate Eulerian polynomials is highly accurate. In particular, we see that this relaxation-based spectrahedral method for approximation of the rigidly convex sets defined by multivariate Eulerian polynomials provides bounds for the extreme roots of the corresponding univariate Eulerian polynomials that are better than these already found in the literature. All in all, this tells us that, at least close to the diagonal, the global outer approximation to the rigidly convex sets provided by this relaxation-based spectrahedral method is itself highly accurate.
We propose a network behavioral-feedback Susceptible-Infected-Recovered (SIR) epidemic model in which the interaction matrix describing the infection rates across subpopulations depends in feedback on the current epidemic state. This model captures both heterogeneities in individuals mixing, contact frequency, aptitude to contract and spread the infection, and endogenous behavioral responses such as voluntary social distancing and the adoption of self-protective measures. We study the stability of the equilibria and illustrate through several examples how the shape of the stability region depends on the structure of the interaction matrix, providing insights for the design of effective control strategies. We then analyze the transient behavior of the dynamics, showing that, for a special class of rank-1 interaction matrices, there always exists an aggregate infection curve that exhibits a unimodal behavior, expanding the results on the unimodality of infection curve known in the literature of epidemic models and paving the way for future control applications.
Many applications have been identified which require the deployment of large-scale low-power wireless sensor networks. Some of the deployment environments, however, impose harsh operation conditions due to intense cross-technology interference, extreme weather conditions (heavy rainfall, excessive heat, etc.), or rough motion, thereby affecting the quality and predictability of the wireless links the nodes establish. In localization tasks, these conditions often lead to significant errors in estimating the position of target nodes. Motivated by the practical deployments of sensors on the surface of different water bodies, we address the problem of identifying susceptible nodes and robustly estimating their positions. We formulate these tasks as a compressive sensing problem and propose algorithms for both node identification and robust estimation. Additionally, we design an optimal anchor configuration to maximize the robustness of the position estimation task. Our numerical results and comparisons with competitive methods demonstrate that the proposed algorithms achieve both objectives with a modest number of anchors. Since our method relies only on target-to-anchor distances, it is broadly applicable and yields resilient, robust localization.
The ability to train Deep Neural Networks (DNNs) with constraints is instrumental in improving the fairness of modern machine-learning models. Many algorithms have been analysed in recent years, and yet there is no standard, widely accepted method for the constrained training of DNNs. In this paper, we provide a challenging benchmark of real-world large-scale fairness-constrained learning tasks, built on top of the US Census (Folktables). We point out the theoretical challenges of such tasks and review the main approaches in stochastic approximation algorithms. Finally, we demonstrate the use of the benchmark by implementing and comparing three recently proposed, but as-of-yet unimplemented, algorithms both in terms of optimization performance, and fairness improvement. We release the code of the benchmark as a Python package at https://github.com/humancompatible/train.
We derive the divergence-kernel formula for the scores of random dynamical systems, then formally pass to the continuous-time limit of SDEs. Our formula works for multiplicative noise systems over any period of time; it does not require hyperbolicity. We also consider several special cases: (1) for additive noise, we give a pure kernel formula; (2) for short-time, we give a pure divergence formula; (3) we give a formula which does not involve scores of the initial distribution. Based on the new formula, we derive a pathwise Monte-Carlo algorithm for scores, and demonstrate it on the 40-dimensional Lorenz 96 system with multiplicative noise.
A long-time behavior of solutions to a nonlinear plate model subject to non-conservative and non-dissipative effects and nonlinear damping is considered. The model under study is a prototype for a suspension bridge under the effects of unstable flow of gas. To counteract the unwanted oscillations a damping mechanism of a nonlinear nature is applied. From the point of view of nonlinear PDEs, we are dealing with a non-dissipative and nonlinear second order in time dynamical system of hyperbolic nature subjected to nonlinear damping. One of the first goals is to establish ultimate dissipativity of all solutions, which will imply an existence of a weak attractor. The combined effects of non-dissipative forcing with nonlinear damping-leading to an overdamping-give rise to major challenges in proving an existence of an absorbing set. Known methods based on equipartition of the energy do not suffice. A rather general novel methodology based on ``barrier's'' method will be developed to address this and related problems. Ultimately, it will be shown that a weak attractor becomes strong, and the nonlinear PDE system has a coherent finite-dimensional asymptotic behavior.
We study adaptive two-sided assortment optimization for revenue maximization in choice-based matching platforms. The platform has two sides of agents, an initiating side, and a responding side. The decision-maker sequentially selects agents from the initiating side, shows each an assortment of agents from the responding side, and observes their choices. After processing all initiating agents, the responding agents are shown assortments and make their selections. A match occurs when two agents mutually select each other, generating pair-dependent revenue. Choices follow Multinomial Logit (MNL) models. This setting generalizes prior work focused on maximizing the number of matches under submodular demand assumptions, which do not hold in our revenue-maximization context. Our main contribution is the design of polynomial-time approximation algorithms with constant-factor guarantees. In particular, for general pairwise revenues, we develop a randomized algorithm that achieves a (\frac{1}{2} - \epsilon)-approximation in expectation for any \epsilon > 0. The algorithm is static and provides guarantees under various agent arrival settings, including fixed order, simultaneous processing, and adaptive selection. When revenues are uniform across all pairs involving any given responding-side agent, the guarantee improves to (1 - \frac{1}{e} - \epsilon). In structural settings where responding-side agents share a common revenue-based ranking, we design a simpler adaptive deterministic algorithm achieving a \frac{1}{2}-approximation. Our approach leverages novel linear programming relaxations, correlation gap arguments, and structural properties of the revenue functions.
First-order methods in convex optimization offer low per-iteration cost but often suffer from slow convergence, while second-order methods achieve fast local convergence at the expense of costly Hessian inversions. In this paper, we highlight a middle ground: minimizing a quadratic majorant with fixed curvature at each iteration. This strategy strikes a balance between per-iteration cost and convergence speed, and crucially allows the reuse of matrix decompositions, such as Cholesky or spectral decompositions, across iterations and varying regularization parameters. We introduce the Quadratic Majorization Minimization with Extrapolation (QMME) framework and establish its sequential convergence properties under standard assumptions. The new perspective of our analysis is to center the arguments around the induced norm of the curvature matrix H. To demonstrate practical advantages, we apply QMME to large-scale kernel regularized learning problems. In particular, we propose a novel Sylvester equation modelling technique for kernel multinomial regression. In Julia-based experiments, QMME compares favorably against various established first- and second-order methods. Furthermore, we demonstrate that our algorithms complement existing kernel approximation techniques through more efficiently handling sketching matrices with large projection dimensions. Our numerical experiments and real data analysis are available and fully reproducible at https://github.com/qhengncsu/QMME.jl.
Recent years have seen rapid increases in intermittent renewable generation, requiring novel battery energy storage systems (BESS) solutions. One recent trend is the emergence of large grid-connected batteries, that can be controlled to provide multiple storage and flexibility services, using a stacked revenue model. Another emerging development is renewable energy communities (REC), in which prosumers invest in their own renewable generation capacity, but also requiring battery storage for flexibility. In this paper, we study settings in which energy communities rent battery capacity from a battery operator through a battery-as-a-service (BaaS) model. We present a methodology for determining the sizing and pricing of battery capacity that can be rented, such that it provides economic benefits to both the community and the battery operator that participates in the energy market. We examine how sizes and prices vary across a number of different scenarios for different types of tariffs (flat, dynamic) and competing energy market uses. Second, we conduct a systematic study of linear optimization models for battery control when deployed to provide flexibility to energy communities. We show that existing approaches for battery control with daily time windows have a number of important limitations in practical deployments, and we propose a number of regularization functions in the optimization to address them. Finally, we investigate the proposed method using real generation, demand, tariffs, and battery data, based on a practical case study from a large battery operator in the Netherlands. For the settings in our case study, we find that a community of 200 houses with a 330 kW wind turbine can save up to 12,874 euros per year by renting just 280 kWh of battery capacity (after subtracting battery rental costs), with the methodology applicable to a wide variety of settings and tariff types.
We study reinforcement learning (RL) in the agnostic policy learning setting, where the goal is to find a policy whose performance is competitive with the best policy in a given class of interest \Pi -- crucially, without assuming that \Pi contains the optimal policy. We propose a general policy learning framework that reduces this problem to first-order optimization in a non-Euclidean space, leading to new algorithms as well as shedding light on the convergence properties of existing ones. Specifically, under the assumption that \Pi is convex and satisfies a variational gradient dominance (VGD) condition -- an assumption known to be strictly weaker than more standard completeness and coverability conditions -- we obtain sample complexity upper bounds for three policy learning algorithms: \emph{(i)} Steepest Descent Policy Optimization, derived from a constrained steepest descent method for non-convex optimization; \emph{(ii)} the classical Conservative Policy Iteration algorithm \citep{kakade2002approximately} reinterpreted through the lens of the Frank-Wolfe method, which leads to improved convergence results; and \emph{(iii)} an on-policy instantiation of the well-studied Policy Mirror Descent algorithm. Finally, we empirically evaluate the VGD condition across several standard environments, demonstrating the practical relevance of our key assumption.
We study the periodic assignment problem, in which a set of periodically repeating tasks must be assigned to workers within a repeating schedule. The classical efficiency objective is to minimize the number of workers required to operate the schedule. We propose a O(n log n) algorithm to solve this problem. Next, we formalize a notion of fairness among workers, and impose that each worker performs the same work over time. We analyze the resulting trade-off between efficiency and fairness, showing that the price of fairness is at most one extra worker, and that such a fair solution can always be found using the Nearest Neighbor heuristic. We characterize all instances that admit a solution that is both fair and efficient, and use this result to develop a O(n log n) exact algorithm for the fair periodic assignment problem. Finally, we show that allowing aperiodic schedules never reduces the price of fairness.
The design of transfers to periodic orbits in the Earth--Moon system has regained prominence with NASA's Artemis and CNSA's Chang'e programs. This work addresses the problem of linking ballistic capture trajectories - exploiting multi-body dynamics for temporary lunar orbit insertion - with bounded periodic motion described in the circular restricted three-body problem (CR3BP). A unified framework is developed for optimizing bi-impulsive transfers to families of periodic orbits via a high-order polynomial expansion of the CR3BP dynamics. That same expansion underlies a continuous 'abacus' parameterization of orbit families, enabling rapid targeting and analytic sensitivity. Transfers to planar periodic-orbit families (Lyapunov L1 and L2, and distant retrograde orbits) are addressed first, followed by extension to spatial families, such as butterfly and halo L1/L2 orbits, with an emphasis towards Near-Rectilinear Halo Orbits (NRHOs). Numerical results demonstrate low-cost solutions and validate the method's adaptability for the design of lunar missions. The optimized trajectories can inform an established low-energy transfer database, enriching it with detailed cost profiles that reflect both transfer feasibility and underlying dynamical relationships to specific periodic-orbit families. Finally, the proposed transfers provide reliable initial guesses for rapid refinement, readily adaptable for further optimization across mission-specific needs.
In 2022, we published a book, \emph{Maximum-Entropy Sampling: Algorithms and Application (Springer)}. Since then, there have been several notable advancements on this topic. In this manuscript, we survey some recent highlights.
Modern transportation network modeling increasingly involves the integration of diverse methodologies including sensor-based forecasting, reinforcement learning, classical flow optimization, and demand modeling that have traditionally been developed in isolation. This paper introduces Flow Through Tensors (FTT), a unified computational graph architecture that connects origin destination flows, path probabilities, and link travel times as interconnected tensors. Our framework makes three key contributions: first, it establishes a consistent mathematical structure that enables gradient-based optimization across previously separate modeling elements; second, it supports multidimensional analysis of traffic patterns over time, space, and user groups with precise quantification of system efficiency; third, it implements tensor decomposition techniques that maintain computational tractability for large scale applications. These innovations collectively enable real time control strategies, efficient coordination between multiple transportation modes and operators, and rigorous enforcement of physical network constraints. The FTT framework bridges the gap between theoretical transportation models and practical deployment needs, providing a foundation for next generation integrated mobility systems.
We present a mathematical model for optimizing breakaway strategies in competitive cycling, balancing power expenditure, aerodynamic drag, and crashing. Our framework incorporates probabilistic crash dynamics, allowing a cyclist's risk tolerance to shape optimal tactics. We define an objective function that accounts for both finish time differences and the probability of crashing, which we optimize subject to an energy expenditure constraint. We demonstrate the methodology for a flat stage with a simple constant-power breakaway. We then extend this analysis to account for fatigue-driven power decay, and varying terrain and race conditions. We highlight the importance of strategy by demonstrating that carefully planned decision making can lead to a race win even when the energy expenditure is low. Our results highlight and quantify the fact that, at the elite level, success often depends as much on minimizing risk as on maximizing physical output.
In this work, we address the exact D-optimal experimental design problem by proposing an efficient algorithm that rapidly identifies the support of its continuous relaxation. Our method leverages a column generation framework to solve such a continuous relaxation, where each restricted master problem is tackled using a Primal-Dual Interior-Point-based Semidefinite Programming solver. This enables fast and reliable detection of the design's support. The identified support is subsequently used to construct a feasible exact design that is provably close to optimal. We show that, for large-scale instances in which the number of regression points exceeds by far the number of experiments, our approach achieves superior performance compared to existing branch-and-bound-based algorithms in both computational efficiency and solution quality.
Developing algorithms for real-life problems that perform well in practice highly depends on the availability of realistic data for testing. Obtaining real-life data for optimization problems in health care, however, is often difficult. This is especially true for any patient related optimization problems, e.g., for patient-to-room assignment, due to data privacy policies. Furthermore, obtained real-life data usually cannot be published which prohibits reproducibility of results by other researchers. Therefore, often artificially generated instances are used. In this paper, we present combinatorial insights about the feasibility of instances for the patient-to-room assignment problem (PRA). We use these insights to develop a configurable instance generator for PRA with an easy-to-use graphical user interface. Configurability is in this case especially important as we observed in an extensive analysis of real-life data that, e.g., the probability distribution for patients' age and length of stay depends on the respective ward.