2024-11-01 | Total: 33

This paper develops a data-driven stabilization method for continuous-time linear time-invariant systems with theoretical guarantees and no need for signal derivatives. The framework, based on linear matrix inequalities (LMIs), is illustrated in the state-feedback and single-input single-output output-feedback scenarios. Similar to discrete-time approaches, we rely solely on input and state/output measurements. To avoid differentiation, we employ low-pass filters of the available signals that, rather than approximating the derivatives, reconstruct a non-minimal realization of the plant. With access to the filter states and their derivatives, we can solve LMIs derived from sample batches of the available signals to compute a dynamic controller that stabilizes the plant. The effectiveness of the framework is showcased through numerical examples.

In recent years, there have been significant advances in efficiently solving $\ell_s$-regression using linear system solvers and $\ell_2$-regression [Adil-Kyng-Peng-Sachdeva, J. ACM'24]. Would efficient $\ell_p$-norm solvers lead to even faster rates for solving $\ell_s$-regression when $2 \leq p < s$? In this paper, we give an affirmative answer to this question and show how to solve $\ell_s$-regression using $\tilde{O}(n^{\frac{\nu}{1+\nu}})$ iterations of solving smoothed $\ell_p$-regression problems, where $\nu := \frac{1}{p} - \frac{1}{s}$. To obtain this result, we provide improved accelerated rates for convex optimization problems when given access to an $\ell_p^s(\lambda)$-proximal oracle, which, for a point $c$, returns the solution of the regularized problem $\min_{x} f(x) + \lambda \|x-c\|_p^s$. Additionally, we show that the rates we establish for the $\ell_p^s(\lambda)$-proximal oracle are near-optimal.
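
To make the oracle definition concrete, the following is a minimal sketch of an $\ell_p^s(\lambda)$-proximal oracle in the one-dimensional case, where every $\ell_p$ norm reduces to the absolute value, so only $\lambda$ and $s$ play a role. The function name `prox_oracle_1d`, the bracketing interval, and the use of ternary search are illustrative assumptions, not the method of the paper.

```python
def prox_oracle_1d(f, c, lam, s, lo, hi, iters=200):
    """Return argmin over [lo, hi] of f(x) + lam * |x - c|**s.

    Assumes f is convex and s >= 1, so the objective is convex and
    ternary search on the bracketing interval [lo, hi] is valid.
    """
    def g(x):
        return f(x) + lam * abs(x - c) ** s

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) <= g(m2):
            hi = m2  # a minimizer lies in [lo, m2]
        else:
            lo = m1  # a minimizer lies in [m1, hi]
    return 0.5 * (lo + hi)


# Hypothetical example: f(x) = (x - 3)^2, centre c = 0, exponent s = 3.
x_weak = prox_oracle_1d(lambda t: (t - 3.0) ** 2, 0.0, 1.0, 3, -10.0, 10.0)
x_strong = prox_oracle_1d(lambda t: (t - 3.0) ** 2, 0.0, 1e6, 3, -10.0, 10.0)
```

A larger $\lambda$ pulls the returned point towards $c$, while a smaller $\lambda$ lets it move towards the minimizer of $f$.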

In recent years, there has been a surge of interest in studying different ways to reformulate nonconvex optimization problems, especially those that involve binary variables. This surge is due to advancements in computing technologies, such as quantum and Ising devices, as well as improvements in quantum and classical optimization solvers that exploit particular formulations of nonconvex problems to solve them. Our research characterizes the equivalence between equality-constrained nonconvex optimization problems and their Lagrangian relaxation, enabling the aforementioned new technologies to solve these problems. In addition to filling a crucial gap in the literature, our results are readily applicable to many important situations in practice. To obtain these results, we bridge between specific optimization problem characteristics and broader, classical results on Lagrangian duality for general nonconvex problems. Further, we take a comprehensive approach to the question of equivalence between problem formulations: we consider this question not only from the perspective of the problem's objective but also from the viewpoint of its solution. This perspective, often overlooked in existing literature, is particularly relevant for problems featuring continuous and binary variables.

Operations and maintenance (O&M) is a fundamental problem in wind energy systems with far-reaching implications for reliability and profitability. Optimizing O&M is a multi-faceted decision optimization problem that requires a careful balancing act across turbine-level failure risks, operational revenues, and maintenance crew logistics. The resulting O&M problems are typically solved using large-scale mixed integer programming (MIP) models, which yield computationally challenging problems that require either long solution times or heuristics to reach a solution. To address this problem, we introduce a novel decision-making framework for wind farm O&M that builds on multi-head attention (MHA) models, an emerging class of artificial intelligence methods that are specifically designed to learn in rich and complex problem settings. The development of the proposed MHA framework incorporates a number of modeling innovations that allow explicit embedding of MIP models within an MHA structure. The proposed MHA model (i) significantly reduces the solution time from hours to seconds, (ii) guarantees feasibility of the proposed solutions considering complex constraints that are omnipresent in wind farm O&M, (iii) achieves strong solution quality relative to conventional MIP formulations, and (iv) exhibits significant transfer learning capability across different problem settings.

The well-posedness of a class of optimal control problems is analysed, where the state equation couples a nonlinear degenerate Fokker-Planck equation with a system of Ordinary Differential Equations (ODEs). Such problems naturally arise as mean-field limits of Stochastic Differential models for multipopulation dynamics, where a large number of agents (followers) is steered through parsimonious intervention on a selected class of leaders. The proposed approach combines stability estimates for measure solutions of nonlinear degenerate Fokker-Planck equations with a general framework of assumptions on the cost functional, ensuring compactness and lower semicontinuity properties. The Lie structure of the state equations allows one to consider non-Lipschitz nonlinearities, provided some suitable dissipativity assumptions are considered in addition to non-Euclidean Hölder and sublinearity conditions.

Consider an integer or non-negative integer solution $x$ to a system $Ax = b$, where the number of non-zero components of $x$ is at most $n$. This paper addresses the following question: How closely can we approximate $b$ with $Ay$, where $y$ is an integer or non-negative integer solution constrained to have at most $k$ non-zero components with $k<n$? We establish upper and lower bounds for this question in general. In specific cases, these bounds match. The key finding is that the quality of the approximation increases exponentially as $k$ goes to $n$.

We consider multiperiod stochastic control problems with non-parametric uncertainty on the underlying probabilistic model. We derive a new metric on the space of probability measures, called the adapted $(p, \infty)$--Wasserstein distance $\mathcal{AW}_p^\infty$ with the following properties: (1) the adapted $(p, \infty)$--Wasserstein distance generates a topology that guarantees continuity of stochastic control problems and (2) the corresponding $\mathcal{AW}_p^\infty$-distributionally robust optimization (DRO) problem can be computed via a dynamic programming principle involving one-step Wasserstein-DRO problems. If the cost function is semi-separable, then we further show that a minimax theorem holds, even though balls with respect to $\mathcal{AW}_p^\infty$ are neither convex nor compact in general. We also derive first-order sensitivity results.

In this paper, we study linearly constrained optimization problems (LCP). After applying Hadamard parametrization, the feasible set of the parametrized problem (LCPH) becomes an algebraic variety with favorable geometric properties, which we explore in depth. We derive explicit formulas for the tangent cones and second-order tangent sets associated with the parametrized polyhedra. Based on these formulas, we develop a procedure to recover the Lagrangian multipliers associated with the constraints to verify the optimality conditions of the given primal variable without requiring additional constraint qualifications. Moreover, we develop a systematic way to stratify the variety into a disjoint union of finitely many Riemannian manifolds. This leads us to develop a hybrid algorithm combining Riemannian optimization and projected gradient to solve (LCP) with convergence guarantees. Numerical experiments are conducted to verify the effectiveness of our method compared with various state-of-the-art algorithms.

This paper investigates a path-following method inspired by the semismooth$^*$ approach for solving algebraic inclusions, with a primary emphasis on the role of uniform subregularity. Uniform subregularity is crucial for ensuring the robustness and stability of path-following methods, as it provides a framework to uniformly control the distance between the input and the solution set across a continuous path. We explore the problem of finding a mapping $ x: \mathbb{R} \longrightarrow \mathbb{R}^n $ that satisfies $ 0 \in F(t, x(t)) $ for each $ t \in [0, T] $, where $ F $ is a set-valued mapping from $ \mathbb{R} \times \mathbb{R}^n $ to $ \mathbb{R}^n $. The paper discusses two approaches: the first considers mappings with uniform semismooth$^*$ properties along continuous paths, leading to a consistent grid error throughout the interval, while the second examines mappings exhibiting pointwise semismooth$^*$ properties at individual points along the path. The uniform strong subregularity framework is integrated into these approaches to strengthen the stability of solution trajectories and improve algorithmic convergence.

Semidefinite programming (SDP) problems are challenging to solve because of their high dimensionality. However, sparse SDP problems with small tree-width are known to be relatively easier to solve because: (1) they can be decomposed into smaller multi-block SDP problems through chordal conversion; (2) they have low-rank optimal solutions. In this paper, we study more general SDP problems whose coefficient matrices have sparse plus low-rank (SPLR) structure. We develop a unified framework to convert such problems into sparse SDP problems with bounded tree-width. Based on this, we derive rank bounds for SDP problems with SPLR structure, which are tight in the worst case.

In this note, we investigate the problem of global exponential synchronization of multi-agent systems described by nonlinear input-affine dynamics. We consider the case of networks described by undirected connected graphs, possibly without a leader. We present a set of sufficient conditions based on a Riemannian metric approach in order to design a state-feedback distributed control law. Then, we study the convergence properties of the overall network. By exploiting the properties of the edge Laplacian, we construct a Lyapunov function that allows us to conclude global exponential synchronization of the overall network.

We study the problem of online convex optimization with memory and predictions over a horizon $T$. At each time step, a decision maker is given some limited predictions of the cost functions from a finite window of future time steps, i.e., values of the cost function at certain decision points in the future. The decision maker then chooses an action and incurs a cost given by a convex function that depends on the actions chosen in the past. We propose an algorithm to solve this problem and show that the dynamic regret of the algorithm decays exponentially with the prediction window length. Our algorithm contains two general subroutines that work for wider classes of problems. The first subroutine can solve general online convex optimization with memory and bandit feedback with $\sqrt{T}$-dynamic regret with respect to $T$. The second subroutine is a zeroth-order method that can be used to solve general convex optimization problems with a linear convergence rate that matches the best achievable rate of first-order methods for convex optimization. The key to our algorithm design and analysis is the use of truncated Gaussian smoothing when querying the decision points for obtaining the predictions. We complement our theoretical results using numerical experiments.
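
As an illustration of the smoothing idea, the following is a minimal sketch of a zeroth-order gradient estimator that uses Gaussian directions truncated to a ball. The two-point difference, the rejection-sampling truncation, and all names and parameter values (`truncated_gaussian_grad`, `mu`, `radius`, `samples`) are assumptions made for this example; the abstract does not specify the estimator actually used.

```python
import random


def truncated_gaussian_grad(f, x, mu=1e-3, radius=4.0, samples=20000, seed=0):
    """Estimate the gradient of f at x from function values only.

    Directions u are standard Gaussian vectors, redrawn until ||u|| <= radius
    (truncation by rejection). Each sample contributes a two-point finite
    difference along u, scaled by u itself.
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    for _ in range(samples):
        while True:
            u = [rng.gauss(0.0, 1.0) for _ in range(d)]
            if sum(ui * ui for ui in u) <= radius * radius:
                break
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(d):
            grad[i] += slope * u[i] / samples
    return grad


# Hypothetical test function: f(v) = ||v||^2, whose exact gradient is 2v.
point = [1.0, -2.0, 0.5]
estimate = truncated_gaussian_grad(lambda v: sum(t * t for t in v), point)
```

Truncation keeps every query point within distance `mu * radius` of `x`, at the price of a small bias in the estimate.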

Passenger transportation is a core aspect of a railway company's business, with ticket sales playing a central role in generating revenue. Profitable operations in this context rely heavily on the effectiveness of reject-or-assign policies for coach reservations. As in traditional revenue management, uncertainty in demand presents a significant challenge, particularly when seat availability is limited and passengers have varying itineraries. We extend traditional models from the literature by addressing both offline and online versions of the coach reservation problem for group requests, where two or more passengers must be seated in the same coach. For the offline case, in which all requests are known in advance, we propose an exact mathematical programming formulation that incorporates a first-come, first-served fairness condition, ensuring compliance with transportation regulations. We also propose algorithms for online models of the problem, in which requests are only revealed upon arrival, and the reject-or-assign decisions must be made in real time. Our analysis for one of these models overcomes known barriers in the packing literature, yielding strong competitive ratio guarantees when group sizes are relatively small compared to coach capacity, a common scenario in practice. Using data from the Shinkansen Tokyo-Shin-Osaka line, our numerical experiments demonstrate the practical effectiveness of the proposed policies. Our work provides compelling evidence supporting the adoption of fairness constraints, as revenue losses are minimal, and simple algorithms are sufficient for real-time decision-making. Moreover, our findings provide strong support for the adoption of fairness in the railway industry and highlight the financial viability of a regulatory framework that allows railway companies to delay coach assignments if they adhere to stricter rules regarding request rejections.

This paper addresses a novel \emph{cost-sensitive} distributionally robust log-optimal portfolio problem, where the investor faces \emph{ambiguous} return distributions, and a general convex transaction cost model is incorporated. The uncertainty in the return distribution is quantified using the \emph{Wasserstein} metric, which captures distributional ambiguity. We establish conditions that ensure robustly survivable trades for all distributions in the Wasserstein ball under convex transaction costs. By leveraging duality theory, we approximate the infinite-dimensional distributionally robust optimization problem with a finite convex program, enabling computational tractability for mid-sized portfolios. Empirical studies using S\&P 500 data validate our theoretical framework: without transaction costs, the optimal portfolio converges to an equal-weighted allocation, while with transaction costs, the portfolio shifts slightly towards the risk-free asset, reflecting the trade-off between cost considerations and optimal allocation.

This paper introduces a graph-based algorithm for solving single-item, single-location inventory lot-sizing problems under non-stationary stochastic demand using the $(R_t, S_t)$ policy and a penalty cost scheme. The proposed method relaxes the original mixed-integer linear programming (MILP) model by eliminating non-negative order quantity constraints and formulating it as a shortest-path problem on a weighted directed acyclic graph. A repetitive augmentation procedure is proposed to resolve any infeasibility in the solution. This procedure consists of three stages: (1) filtration, (2) repeated augmentation by redirecting, reconnecting, and duplicating between newly introduced and existing nodes to adjust the graph and eliminate negative replenishment orders, and (3) re-optimising. The effectiveness and computational efficiency of the proposed approach are assessed through extensive experiments on 1,620 test instances across various demand patterns and parameter settings. The results show that 195 instances required augmentation, mainly those with high penalty costs, low fixed ordering costs, large demand variability, and extended planning horizons. The efficiency of the algorithm for instances with extended planning horizon scenarios demonstrates its suitability for use in real-world scenarios.
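
The shortest-path view of lot sizing can be sketched as follows: node $i$ marks the start of period $i$, and an arc $(i, j)$ represents one replenishment covering periods $i, \dots, j-1$. This is a minimal illustration only. The arc cost used here (a fixed ordering cost plus deterministic holding cost) is a hypothetical stand-in for the stochastic $(R_t, S_t)$ cycle costs of the paper, and the augmentation procedure is not shown.

```python
def lot_sizing_shortest_path(num_periods, arc_cost):
    """Shortest path from node 0 to node num_periods on a DAG with arcs i -> j, i < j.

    Returns the total cost and the list of periods in which an order is placed.
    """
    inf = float("inf")
    dist = [0.0] + [inf] * num_periods
    pred = [None] * (num_periods + 1)
    for j in range(1, num_periods + 1):  # nodes in topological order
        for i in range(j):
            candidate = dist[i] + arc_cost(i, j)
            if candidate < dist[j]:
                dist[j], pred[j] = candidate, i
    orders = []
    j = num_periods
    while j > 0:  # walk predecessors back to node 0
        orders.append(pred[j])
        j = pred[j]
    orders.reverse()
    return dist[num_periods], orders


# Hypothetical data: demand per period, fixed ordering cost, unit holding cost.
demand = [10, 20, 30]
fixed_cost, holding_cost = 50, 1


def cycle_cost(i, j):
    # Order in period i covers periods i..j-1; demand of period t is held t - i periods.
    return fixed_cost + holding_cost * sum((t - i) * demand[t] for t in range(i, j))


total, orders = lot_sizing_shortest_path(len(demand), cycle_cost)
print(total, orders)  # 120.0 [0, 2]
```

Because the graph is acyclic and nodes are processed in period order, a single forward pass suffices.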

In this paper, a heuristic for a heterogeneous min-max multi-vehicle multi-depot traveling salesman problem is proposed, wherein heterogeneous vehicles start from given depot locations and need to cover a given set of targets. In the considered problem, vehicles can be structurally heterogeneous due to different vehicle speeds and/or functionally heterogeneous due to different vehicle-target assignments originating from different sensing capabilities of vehicles. The proposed heuristic has three stages: an initialization stage to generate an initial feasible solution, a local search stage to improve the incumbent solution by searching through different neighborhoods, and a perturbation/shaking stage, wherein the incumbent solution is perturbed to escape from a local minimum. Three types of neighborhood searches are employed. Furthermore, two different methods for constructing the initial feasible solution are considered, and multiple variations of the neighborhoods are explored. The considered variations and construction methods are evaluated on a total of 128 instances generated with varying vehicle-to-target ratios, distributions for generating the targets, and vehicle-target assignments, and are benchmarked against the best-known heuristic for this problem. Finally, based on extensive computational studies, two heuristics are proposed, depending on whether priority is given to objective value or to computation time.

The superiorization methodology (SM) is an optimization heuristic in which an iterative algorithm, which aims to solve a particular problem, is ``superiorized'' to promote solutions that are improved with respect to some secondary criterion. This superiorization is achieved by perturbing iterates of the algorithm in nonascending directions of a prescribed function that penalizes undesirable characteristics in the solution; the solution produced by the superiorized algorithm should therefore be improved with respect to the value of this function. In this paper, we broaden the SM to allow for the perturbations to be introduced by an arbitrary procedure instead, using a plug-and-play approach. This allows for operations such as image denoisers or deep neural networks, which have applications to a broad class of problems, to be incorporated within the superiorization methodology. As proof of concept, we perform numerical simulations involving low-dose and sparse-view computed tomography image reconstruction, comparing the plug-and-play approach to a conventionally superiorized algorithm, as well as a post-processing approach. The plug-and-play approach provides comparable or better image quality in most cases, while also providing advantages in terms of computing time, and data fidelity of the solutions.
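
The structure of a superiorized iteration can be sketched as a loop that alternates a perturbation with a feasibility-seeking step, where the perturbation is a replaceable callable. Everything below is an illustrative assumption: the basic algorithm (sequential projections onto hyperplanes), the geometric step-size decay, the secondary criterion (the Euclidean norm), and all function names. In a plug-and-play setting the `perturb` argument would be a denoiser or a network rather than the simple shrinkage used here.

```python
import math


def project_hyperplane(x, a, b):
    """Orthogonal projection of x onto the hyperplane {z : a . z = b}."""
    r = (sum(ai * xi for ai, xi in zip(a, x)) - b) / sum(ai * ai for ai in a)
    return [xi - r * ai for xi, ai in zip(x, a)]


def feasibility_sweep(x, rows, rhs):
    """One pass of sequential projections (the basic algorithm)."""
    for a, b in zip(rows, rhs):
        x = project_hyperplane(x, a, b)
    return x


def shrink_norm(x, step):
    """Conventional perturbation: move towards the origin, which never increases ||x||."""
    n = math.sqrt(sum(xi * xi for xi in x))
    if n == 0.0:
        return x
    scale = max(0.0, 1.0 - step / n)
    return [scale * xi for xi in x]


def superiorize(x, rows, rhs, perturb, iters=50, decay=0.5):
    """Alternate perturb(x, step) with a feasibility sweep, using summable steps."""
    step = 1.0
    for _ in range(iters):
        x = perturb(x, step)
        x = feasibility_sweep(x, rows, rhs)
        step *= decay
    return x


# Hypothetical problem: find x with x1 + x2 = 2, starting from (3, 0).
rows, rhs, start = [[1.0, 1.0]], [2.0], [3.0, 0.0]
plain = superiorize(start, rows, rhs, perturb=lambda x, step: x)
superiorized = superiorize(start, rows, rhs, perturb=shrink_norm)
```

Both runs end on the constraint set; in this example the superiorized run ends at a point of smaller norm than the unperturbed one.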

In recent years, the transition to clean bus fleets has accelerated. Although this transition might bring environmental and economic benefits, it requires a long-term strategic plan due to the large investment costs involved. This paper proposes a multi-stage stochastic program to optimize strategic plans for the clean bus fleet transition that explicitly considers the uncertainty scenarios in the cost and efficiency improvements of clean buses. Our optimization model minimizes the total expected cost subject to emission targets, budget restrictions and several other operational considerations. We propose a new forecasting approach that captures the correlation between these improvements to obtain realistic future pathways for Battery Electric Buses (BEBs) and Hydrogen Fuel Cell Buses (HFCBs), which are then given to the multi-stage stochastic program as scenarios. We also utilize a physics-based model for BEBs to accurately capture their energy consumption and recharging needs. As a case study, we focus on the complex public bus network of Istanbul, which aims to transition to a clean bus fleet by 2050. Utilizing real datasets, we solve a five-stage stochastic program spanning a 25-year planning horizon that involves 256 scenarios to obtain dynamic strategic plans that can be used by the policy makers. Our results suggest that BEBs are more advantageous than HFCBs, even in slow BEB but fast HFCB development scenarios. We also conduct several sensitivity analyses to understand the effects of the intermediate emission targets, budget limitations and energy prices.

The non-convex nature of trained neural networks has created significant obstacles in their incorporation into optimization models. Considering the wide array of applications that this embedding has, the optimization and deep learning communities have dedicated significant efforts to the convexification of trained neural networks. Many approaches to date have considered obtaining convex relaxations for each non-linear activation in isolation, which poses limitations in the tightness of the relaxations. Anderson et al. (2020) strengthened these relaxations and provided a framework to obtain the convex hull of the graph of a piecewise linear convex activation composed with an affine function; this effectively convexifies activations such as the ReLU together with the affine transformation that precedes it. In this article, we contribute to this line of work by developing a recursive formula that yields a tight convexification for the composition of an activation with an affine function for a wide scope of activation functions, namely, convex or ``S-shaped''. Our approach can be used to efficiently compute separating hyperplanes or determine that none exists in various settings, including non-polyhedral cases. We provide computational experiments to test the empirical benefits of these convex approximations.
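
For context, a relaxation of a single activation in isolation can be illustrated with the standard ``triangle'' relaxation of the ReLU over an input interval $[lo, hi]$ with $lo < 0 < hi$: the convex hull of the graph is bounded below by $y \geq 0$ and $y \geq x$ and above by the chord through $(lo, 0)$ and $(hi, hi)$. The sketch below shows only this per-activation baseline, not the recursive formula of the paper; the function names are hypothetical.

```python
def relu(x):
    return max(0.0, x)


def relu_upper_envelope(x, lo, hi):
    """Chord through (lo, 0) and (hi, hi); assumes lo < 0 < hi."""
    return hi * (x - lo) / (hi - lo)


def in_triangle_relaxation(x, y, lo, hi, tol=1e-12):
    """Check whether (x, y) lies in the convex hull of the ReLU graph over [lo, hi]."""
    return (
        lo - tol <= x <= hi + tol
        and y >= -tol
        and y >= x - tol
        and y <= relu_upper_envelope(x, lo, hi) + tol
    )


# Every point on the ReLU graph lies in the relaxation, but so do points
# off the graph, such as (0, 0.5) when lo = -1 and hi = 1.
on_graph = all(
    in_triangle_relaxation(x / 10.0, relu(x / 10.0), -1.0, 1.0)
    for x in range(-10, 11)
)
off_graph = in_triangle_relaxation(0.0, 0.5, -1.0, 1.0)
```

The gap between the chord and the ReLU graph is what relaxing each activation in isolation leaves on the table.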

Gaussian processes (GPs) are non-parametric probabilistic regression models that are popular due to their flexibility, data efficiency, and well-calibrated uncertainty estimates. However, standard GP models assume homoskedastic Gaussian noise, while many real-world applications are subject to non-Gaussian corruptions. Variants of GPs that are more robust to alternative noise models have been proposed, and entail significant trade-offs between accuracy and robustness, and between computational requirements and theoretical guarantees. In this work, we propose and study a GP model that achieves robustness against sparse outliers by inferring data-point-specific noise levels with a sequential selection procedure maximizing the log marginal likelihood that we refer to as relevance pursuit. We show, surprisingly, that the model can be parameterized such that the associated log marginal likelihood is strongly concave in the data-point-specific noise variances, a property rarely found in either robust regression objectives or GP marginal likelihoods. This in turn implies the weak submodularity of the corresponding subset selection problem, and thereby proves approximation guarantees for the proposed algorithm. We compare the model's performance relative to other approaches on diverse regression and Bayesian optimization tasks, including the challenging but common setting of sparse corruptions of the labels within or close to the function range.

Optimization in deep learning remains poorly understood, even in the simple setting of deterministic (i.e. full-batch) training. A key difficulty is that much of an optimizer's behavior is implicitly determined by complex oscillatory dynamics, referred to as the "edge of stability". The main contribution of this paper is to show that an optimizer's implicit behavior can be explicitly captured by a "central flow": a differential equation which models the time-averaged optimization trajectory. We show that these flows can empirically predict long-term optimization trajectories of generic neural networks with a high degree of numerical accuracy. By interpreting these flows, we reveal for the first time 1) the precise sense in which RMSProp adapts to the local loss landscape, and 2) an "acceleration via regularization" mechanism, wherein adaptive optimizers implicitly navigate towards low-curvature regions in which they can take larger steps. This mechanism is key to the efficacy of these adaptive optimizers. Overall, we believe that central flows constitute a promising tool for reasoning about optimization in deep learning.

We develop an efficient data-driven and model-free unsupervised learning algorithm for achieving fully passive intelligent reflective surface (IRS)-assisted optimal short/long-term beamforming in wireless communication networks. The proposed algorithm is based on a zeroth-order stochastic gradient ascent methodology, suitable for tackling two-stage stochastic nonconvex optimization problems with continuous uncertainty and unknown (or "black-box") terms present in the objective function, via the utilization of inexact evaluation oracles. We show that the algorithm can operate under realistic and general assumptions, and establish its convergence rate close to some stationary point of the associated two-stage (i.e., short/long-term) problem, particularly in cases where the second-stage (i.e., short-term) beamforming problem (e.g., transmit precoding) is solved inexactly using an arbitrary (inexact) oracle. The proposed algorithm is applicable to a wide variety of IRS-assisted optimal beamforming settings, while also being able to operate without (cascaded) channel model assumptions or knowledge of channel statistics, and over arbitrary IRS physical configurations; thus, no active sensing capability at the IRS(s) is needed. Our algorithm is numerically demonstrated to be very effective in a range of experiments pertaining to a well-studied MISO downlink model, including scenarios demanding physical IRS tuning (e.g., directly through varactor capacitances), even in large-scale regimes.

The amount of debris in orbit has increased significantly over the years. With the recent growth of interest in space exploration, conjunction assessment has become a central issue. One important metric to evaluate conjunction risk is the miss distance. However, this metric does not intrinsically take into account uncertainty distributions. Some work has been developed to consider the uncertainty associated with the position of the orbiting objects, in particular, to know if these uncertainty distributions overlap (e.g., ellipsoids when considering Gaussian distributions). With this work, we present fast solutions to not only check if the ellipsoids overlap but to compute the distance between them, which we call margin. We present two fast solution methods for two different paradigms: when the best-known data from both objects can be centralized (e.g., debris-satellite conjunctions) and when the most precise covariances cannot be shared (conjunctions of satellites owned by different operators). Our methods are both accurate and fast, being able to process 15,000 conjunctions per minute with the centralized solution and approximately 490 conjunctions per minute with the distributed solution.

Bilevel optimization problems are characterized by an interactive hierarchical structure, where the upper level seeks to optimize its strategy while simultaneously considering the response of the lower level. Evolutionary algorithms are commonly used to solve complex bilevel problems in practical scenarios, but they face significant resource consumption challenges due to the nested structure imposed by the implicit lower-level optimality condition. This challenge becomes even more pronounced as problem dimensions increase. Although recent methods have enhanced bilevel convergence through task-level knowledge sharing, further efficiency improvements are still hindered by redundant lower-level iterations that consume excessive resources while generating unpromising solutions. To overcome this challenge, this paper proposes an efficient dynamic resource allocation framework for evolutionary bilevel optimization, named DRC-BLEA. Compared to existing approaches, DRC-BLEA introduces a novel competitive quasi-parallel paradigm, in which multiple lower-level optimization tasks, derived from different upper-level individuals, compete for resources. A continuously updated selection probability is used to give promising tasks priority for execution. Additionally, a cooperation mechanism is integrated within the competitive framework to further enhance efficiency and prevent premature convergence. Experimental comparisons with selected state-of-the-art algorithms demonstrate the effectiveness of the proposed method. Specifically, DRC-BLEA achieves competitive accuracy across diverse problem sets and real-world scenarios, while significantly reducing the number of function evaluations and overall running time.

Thermal states play a fundamental role in various areas of physics, and they are becoming increasingly important in quantum information science, with applications related to semi-definite programming, quantum Boltzmann machine learning, Hamiltonian learning, and the related task of estimating the parameters of a Hamiltonian. Here we establish formulas underlying the basic geometry of parameterized thermal states, and we delineate quantum algorithms for estimating the values of these formulas. More specifically, we prove formulas for the Fisher--Bures and Kubo--Mori information matrices of parameterized thermal states, and our quantum algorithms for estimating their matrix elements involve a combination of classical sampling, Hamiltonian simulation, and the Hadamard test. These results have applications in developing a natural gradient descent algorithm for quantum Boltzmann machine learning, which takes into account the geometry of thermal states, and in establishing fundamental limitations on the ability to estimate the parameters of a Hamiltonian, when given access to thermal-state samples. For the latter task, and for the special case of estimating a single parameter, we sketch an algorithm that realizes a measurement that is asymptotically optimal for the estimation task. We finally stress that the natural gradient descent algorithm developed here can be used for any machine learning problem that employs the quantum Boltzmann machine ansatz.