Date: Fri, 9 Aug 2024 | Total: 17

Bi-fidelity stochastic optimization is increasingly favored for streamlining optimization processes by employing a cost-effective low-fidelity (LF) function, with the goal of optimizing a more expensive high-fidelity (HF) function. In this paper, we introduce ASTRO-BFDF, a new adaptive sampling trust region method specifically designed for solving unconstrained bi-fidelity stochastic derivative-free optimization problems. Within ASTRO-BFDF, the LF function serves two purposes: first, to identify better iterates for the HF function when the optimization process indicates a high correlation between the two, and second, to reduce the variance of the HF function estimates via bi-fidelity Monte Carlo (BFMC). In particular, the sample sizes are determined dynamically, with the option of employing either crude Monte Carlo or BFMC, while balancing optimization error and sampling error. We demonstrate that the iterates generated by ASTRO-BFDF converge to a first-order stationary point almost surely. Additionally, we numerically demonstrate the superiority of our proposed algorithm by testing it on synthetic problems and simulation optimization problems with discrete event simulations.
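
The variance-reduction role of the LF function can be illustrated with a generic control-variate estimator. The sketch below is not the ASTRO-BFDF estimator itself: the functions `hf` and `lf`, the sample sizes, and the way the coefficient `c` is estimated are all assumptions chosen for illustration.

```python
import random
import statistics

def hf(z, e):
    # Hypothetical expensive high-fidelity output; its true mean is 1.0.
    return 1.0 + z + 0.1 * e

def lf(z):
    # Hypothetical cheap low-fidelity output, strongly correlated with hf.
    return z

def crude_mc(rng, n):
    # Plain Monte Carlo: average n high-fidelity samples.
    return statistics.fmean(hf(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n))

def bifidelity_mc(rng, n, m):
    # n paired (HF, LF) samples plus m extra LF-only samples.
    zs = [rng.gauss(0, 1) for _ in range(n)]
    h = [hf(z, rng.gauss(0, 1)) for z in zs]
    l = [lf(z) for z in zs]
    extra = [lf(rng.gauss(0, 1)) for _ in range(m)]
    h_bar, l_bar = statistics.fmean(h), statistics.fmean(l)
    # Control-variate coefficient estimated from the paired samples.
    cov = sum((a - h_bar) * (b - l_bar) for a, b in zip(h, l)) / (n - 1)
    c = cov / statistics.variance(l)
    # Correct the HF mean using the better-resolved LF mean.
    l_all = statistics.fmean(l + extra)
    return h_bar - c * (l_bar - l_all)

if __name__ == "__main__":
    rng = random.Random(1)
    print("crude MC estimate:      ", crude_mc(rng, 20))
    print("bi-fidelity MC estimate:", bifidelity_mc(rng, 20, 2000))
```

When the two fidelities are highly correlated, the correction term removes most of the sampling noise in the HF average at the cost of additional cheap LF evaluations; when they are not, crude Monte Carlo may be the better choice, which is the trade-off the abstract says the method manages adaptively.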

This paper examines nonlinear optimization problems that incorporate discrete decisions. We introduce improved formulation techniques that take advantage of the simplotope structure present in the domain of the binarization variables. Our technique identifies new polynomially solvable instances of the price promotion problem initially studied by Cohen et al. (2021) and allows us to develop a linear programming (LP) formulation for the inventory optimization problem under a choice model proposed by Boada-Collado and Martinez-de Albeniz (2020). The techniques we develop rely on ideal formulations for submodular and fractional compositions of discrete functions, improving prior formulations for bilinear products suggested by Adams and Henry (2012). Submodular compositions also generalize L$^\natural$ functions over bounded domains, and our construction provides new insights into Lovász-extension-based formulations for this class of problems while expanding the class of nonlinear discrete optimization problems amenable to linear-programming-based techniques.

The framework of decision-making, modeled as a Markov Decision Process (MDP), typically assumes a single objective. However, most practical scenarios involve considering tradeoffs between multiple objectives. With that as the motivation, we consider the task of finding the Pareto front of achievable tradeoffs in the context of the Linear Quadratic Regulator (LQR), a canonical example of a continuous, infinite-horizon MDP. As our first contribution, we establish that the Pareto front for LQR is characterized by linear scalarization, wherein a linear combination of the objectives creates a single objective, and varying the weights of the linear combination achieves different possible tradeoffs. That is, each tradeoff point on the Pareto front of multi-objective LQR turns out to be a single-objective LQR where the objective is a convex combination of the multiple objectives. Intellectually, our work provides an important example of linear scalarization being sufficient for a non-convex multi-objective problem. As our second contribution, we establish the smoothness of the Pareto front, showing that the optimal control for an $\epsilon$-perturbation of a scalarization parameter yields an $O(\epsilon)$-approximation to its objective performance. Together these results highlight a simple algorithm to approximate the continuous Pareto front by optimizing over a grid of scalarization parameters. Unlike other scalarization methods, each individual optimization problem retains the structure of a single-objective LQR problem, making it computationally feasible. Lastly, we extend the results to consider certainty equivalence, where the unknown dynamics are replaced with estimates.
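
The grid-based approximation of the Pareto front can be sketched on a toy problem. The example below assumes a scalar discrete-time system $x_{t+1} = a x_t + b u_t$ with two objectives, total state cost $\sum_t x_t^2$ and total control cost $\sum_t u_t^2$; the system, the weights, and all function names are hypothetical and are not taken from the paper.

```python
def solve_scalar_lqr(a, b, q, r, iters=10_000, tol=1e-12):
    """Gain k of the optimal policy u = -k x for the cost sum(q x^2 + r u^2).

    Found by iterating the scalar discrete-time Riccati recursion to a fixed point.
    """
    p = q
    for _ in range(iters):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return a * b * p / (r + b * b * p)

def objectives(a, b, k, x0=1.0):
    """Closed-form totals of x^2 and u^2 under the stabilizing policy u = -k x."""
    a_cl = a - b * k                      # closed-loop dynamics x_{t+1} = a_cl x_t
    total_x2 = x0 * x0 / (1.0 - a_cl * a_cl)
    return total_x2, k * k * total_x2

def pareto_front(a, b, weights):
    """One single-objective LQR solve per scalarization weight w in (0, 1)."""
    front = []
    for w in weights:
        k = solve_scalar_lqr(a, b, q=w, r=1.0 - w)
        front.append((w, *objectives(a, b, k)))
    return front

if __name__ == "__main__":
    for w, j_state, j_control in pareto_front(1.1, 1.0, [i / 10 for i in range(1, 10)]):
        print(f"w={w:.1f}  state cost={j_state:.4f}  control cost={j_control:.4f}")
```

Each grid point costs one standard LQR solve, and increasing the weight on the state cost trades control effort for regulation performance. A finer grid gives a closer approximation of the front.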

This paper focuses on the contextual optimization problem, where a decision is subject to uncertain parameters, and covariates that have some predictive power on those parameters are available before the decision is made. More specifically, we focus on solving the Wasserstein-distance-based distributionally robust optimization (DRO) model for the problem, which maximizes the worst-case expected objective over an uncertainty set including all distributions close enough to a nominal distribution with respect to the Wasserstein distance. We develop a stochastic gradient descent algorithm based on the idea of data augmentation to solve the model efficiently. The algorithm iteratively (a) draws a bootstrap sample from the nominal distribution, (b) perturbs the sample adversarially, and (c) updates the decision. Accordingly, the computational time of the algorithm is determined only by the number of iterations and the complexity of computing the gradient of a single sample. Besides solving the model efficiently, the algorithm can cope with any nominal distribution and is therefore extendable to an online setting. We also prove that the algorithm converges to the optimal solution of the DRO model at a rate of $O(1/\sqrt{T})$, where $T$ is the number of bootstrap iterations. Consequently, the performance guarantee of the algorithm is that of the DRO model plus $O(1/\sqrt{T})$. Through extensive numerical experiments, we demonstrate the superior performance of the proposed algorithm over several benchmarks.
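
The sample, perturb, update loop can be sketched on a one-dimensional toy problem. Everything specific here is an assumption made for illustration and is not the paper's algorithm: covariates are ignored, the objective is a squared loss to be minimized, the Wasserstein constraint is replaced by a quadratic transport penalty with multiplier `gamma`, and the step size is `step / sqrt(t)`.

```python
import random

def adversarial_perturbation(z, xi, gamma):
    # Closed-form maximizer over xi_adv of
    #   (z - xi_adv)**2 - gamma * (xi_adv - xi)**2,
    # which is concave in xi_adv only when gamma > 1.
    return (gamma * xi - z) / (gamma - 1.0)

def dro_sgd(data, gamma=4.0, iters=5000, step=0.1, seed=0):
    """Return the averaged iterate of a penalized-DRO stochastic gradient loop."""
    rng = random.Random(seed)
    z, z_sum = 0.0, 0.0
    for t in range(1, iters + 1):
        xi = rng.choice(data)                             # (a) bootstrap sample
        xi_adv = adversarial_perturbation(z, xi, gamma)   # (b) adversarial perturbation
        grad = 2.0 * (z - xi_adv)                         # gradient of the loss at xi_adv
        z -= step / t ** 0.5 * grad                       # (c) decision update
        z_sum += z
    return z_sum / iters

if __name__ == "__main__":
    print(dro_sgd([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Each iteration touches a single sample, so the cost per step does not depend on the size of the data set. In this toy case the worst-case objective is a scaled version of the nominal squared loss, so the iterates settle near the sample mean.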

In this survey we consider polynomial optimization problems, which ask to minimize a polynomial function over a compact semialgebraic set defined by polynomial inequalities. This models a great variety of (in general, nonlinear nonconvex) optimization problems. Various hierarchies of (lower and upper) bounds have been introduced, having the remarkable property that they converge asymptotically to the global minimum. These bounds exploit algebraic representations of positive polynomials in terms of sums of squares and can be computed using semidefinite optimization. Our focus lies in the performance analysis of these hierarchies of bounds, namely, in how far the bounds are from the global minimum as the degrees of the sums of squares they involve tend to infinity. We present the main state-of-the-art results and offer a gentle introductory overview of the various techniques that have been recently developed to establish them, stemming from the theory of orthogonal polynomials, approximation theory, Fourier analysis, and more.

We consider a moving target that we seek to learn from samples. Our results extend randomized techniques developed in control and optimization for a constant target to the case where the target is changing. We derive a novel bound on the number of samples that are required to construct a probably approximately correct (PAC) estimate of the target. Furthermore, when the moving target is a convex polytope, we provide a constructive method of generating the PAC estimate using a mixed integer linear program (MILP). The proposed method is demonstrated on an application to autonomous emergency braking.

In this paper, we investigate the nonemptiness of the weak Pareto efficient solution set for a class of nonsmooth vector optimization problems on a nonempty closed constraint set without any boundedness or convexity assumptions. First, we obtain a new property concerning the nonemptiness of weak Pareto efficient solution sets for these vector optimization problems. Then, under the condition of weak section-boundedness from below, we establish relationships between the notions of properness, Palais-Smale, weak Palais-Smale and M-tameness conditions with respect to some index sets for the restriction of the vector mapping to the constraint set. Finally, we present some new necessary and sufficient conditions for the nonemptiness of the weak Pareto efficient solution set for a general class of nonsmooth vector optimization problems.

In this paper, we consider the social optimum problem for discrete-time, finite-state-space mean field games (referred to as finite mean field games [1]). Unlike competitive models, in which individuals optimize their own cost functions, in the problem we consider, individuals aim to optimize the social cost by finding a fixed point of the state distribution to achieve equilibrium in the mean field game. We provide a sufficient condition for the existence and uniqueness of the individual optimal strategies used to minimize the social cost. Based on the definition of the social optimum and the derived properties of the socially optimal cost, we give existence and uniqueness conditions for equilibrium solutions under initial-terminal value constraints in the finite horizon, and for stable solutions in the infinite horizon. Finally, two examples that satisfy the conditions for the above solutions are provided.

This paper studies the dynamics of a network of diffusively coupled bistable systems. Under mild conditions and without requiring smoothness of the vector field, we analyze the network dynamics and show that the solutions converge globally to the set of equilibria for generic monotone (but not necessarily strictly monotone) regulatory functions. Sufficient conditions for global state synchronization are provided. Finally, by adopting a piecewise linear approximation of the vector field, we determine the existence, location and stability of the equilibria as a function of the coupling gain. The theoretical results are illustrated with numerical simulations.

An optimal guidance method is developed that reduces sensitivity to parameters in the dynamic model. The method combines a previously developed method for guidance and control using adaptive Legendre-Gauss-Radau (LGR) collocation and a previously developed approach for desensitized optimal control. Guidance updates are performed such that the desensitized optimal control problem is re-solved on the remaining horizon at the start of each guidance cycle. The effectiveness of the method is demonstrated on a simple example using Monte Carlo simulation. It is found that the method reduces variations in the terminal state as compared to either desensitized optimal control without guidance updates or a previously developed method for optimal guidance and control.

Linear programs with quadratic regularization are attracting renewed interest due to their applications in optimal transport: unlike entropic regularization, the squared-norm penalty gives rise to sparse approximations of optimal transport couplings. It is well known that the solution of a quadratically regularized linear program over any polytope converges stationarily to the minimal-norm solution of the linear program when the regularization parameter tends to zero. However, that result is merely qualitative. Our main result quantifies the convergence by specifying the exact threshold for the regularization parameter, after which the regularized solution also solves the linear program. Moreover, we bound the suboptimality of the regularized solution before the threshold. These results are complemented by a convergence rate for the regime of large regularization. We apply our general results to the setting of optimal transport, where we shed light on how the threshold and suboptimality depend on the number of data points.
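
The threshold behavior can be seen on a tiny instance. The sketch below assumes the regularized problem takes the form $\min_x\, c^\top x + \tfrac{\varepsilon}{2}\|x\|^2$ over the probability simplex (the paper's scaling of the penalty may differ), in which case the solution is the Euclidean projection of $-c/\varepsilon$ onto the simplex. The cost vector and function names are hypothetical.

```python
def project_to_simplex(y):
    """Euclidean projection of y onto {x >= 0, sum(x) = 1} (sort-based method)."""
    u = sorted(y, reverse=True)
    cumulative, tau = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            tau = candidate   # keep the shift from the largest k that passes the test
    return [max(v - tau, 0.0) for v in y]

def regularized_lp_solution(c, eps):
    # Completing the square: minimizing c.x + (eps/2)*||x||^2 over the simplex
    # is the same as projecting -c/eps onto the simplex.
    return project_to_simplex([-ci / eps for ci in c])

if __name__ == "__main__":
    c = [0.0, 1.0, 2.0]   # the unregularized LP is solved by the vertex (1, 0, 0)
    for eps in (0.5, 1.0, 2.0, 8.0):
        print(eps, regularized_lp_solution(c, eps))
```

For this cost vector the regularized solution equals the LP vertex $(1, 0, 0)$ whenever $\varepsilon \le 1$, and spreads mass over more coordinates once $\varepsilon$ exceeds 1. That value is worked out by hand for this toy instance only; it is not the paper's general threshold formula.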

We study nonnegative and sums of squares symmetric (and even symmetric) functions of fixed degree. We can think of these as limit cones of symmetric nonnegative polynomials and symmetric sums of squares of fixed degree as the number of variables goes to infinity. We compare these cones, including finding explicit examples of nonnegative polynomials which are not sums of squares for any sufficiently large number of variables, and compute the tropicalizations of their dual cones in the even symmetric case. We find that the tropicalization of the dual cones is naturally understood in terms of the overlooked superdominance order on partitions. The power sum symmetric functions obey this same partial order (analogously to how term-normalized power sums obey the dominance order).

Given a unichain Markov reward process (MRP), we provide an explicit expression for the bias values in terms of mean first passage times. This result implies a generalization of known Markov chain perturbation bounds for the stationary distribution to the case where the perturbed chain is not irreducible. It further yields an improved perturbation bound in 1-norm. As a special case, Kemeny's constant can be interpreted as the translated bias in an MRP with constant reward 1, which offers an intuitive explanation why it is a constant.
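
The fact that Kemeny's constant does not depend on the starting state can be checked numerically. The sketch below uses a hypothetical three-state irreducible, aperiodic chain and plain fixed-point iterations. It computes $\sum_j \pi_j m_{ij}$ for every start state $i$, where $m_{ij}$ is the mean first passage time from $i$ to $j$ (with $m_{ii} = 0$). It does not implement the paper's bias formula.

```python
def stationary_distribution(P, iters=2000):
    """Power iteration pi <- pi P; assumes an irreducible, aperiodic chain."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def mean_first_passage_times(P, iters=2000):
    """m[i][j] = expected number of steps to first reach j from i (m[j][j] = 0)."""
    n = len(P)
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col = [0.0] * n
        for _ in range(iters):
            # First-step analysis: m_ij = 1 + sum over k != j of P_ik * m_kj.
            col = [
                0.0 if i == j
                else 1.0 + sum(P[i][k] * col[k] for k in range(n) if k != j)
                for i in range(n)
            ]
        for i in range(n):
            m[i][j] = col[i]
    return m

def kemeny_values(P):
    """Return sum_j pi_j * m_ij for each start state i."""
    n = len(P)
    pi = stationary_distribution(P)
    m = mean_first_passage_times(P)
    return [sum(pi[j] * m[i][j] for j in range(n)) for i in range(n)]

if __name__ == "__main__":
    P = [[0.5, 0.3, 0.2],
         [0.2, 0.6, 0.2],
         [0.1, 0.4, 0.5]]
    print(kemeny_values(P))  # the entries agree up to numerical error
```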

In dynamic programming and reinforcement learning, the policy for the sequential decision making of an agent in a stochastic environment is usually determined by expressing the goal as a scalar reward function and seeking a policy that maximizes the expected total reward. However, many goals that humans care about naturally concern multiple aspects of the world, and it may not be obvious how to condense those into a single reward function. Furthermore, maximization suffers from specification gaming, where the obtained policy achieves a high expected total reward in an unintended way, often taking extreme or nonsensical actions. Here we consider finite acyclic Markov Decision Processes with multiple distinct evaluation metrics, which do not necessarily represent quantities that the user wants to be maximized. We assume the task of the agent is to ensure that the vector of expected totals of the evaluation metrics falls into some given convex set, called the aspiration set. Our algorithm guarantees that this task is fulfilled by using simplices to approximate feasibility sets and propagate aspirations forward while ensuring they remain feasible. It has complexity linear in the number of possible state-action-successor triples and polynomial in the number of evaluation metrics. Moreover, the explicitly non-maximizing nature of the chosen policy and goals yields additional degrees of freedom, which can be used to apply heuristic safety criteria to the choice of actions. We discuss several such safety criteria that aim to steer the agent towards more conservative behavior.

In combinatorial optimization, matroids provide one of the most elegant structures for algorithm design. This is perhaps best illustrated by the Edmonds-Rado theorem relating the success of the simple greedy algorithm to the anatomy of the optimal basis of a matroid [Edm71; Rad57]. In response, much energy has been devoted to understanding a matroid's favorable computational properties. Yet surprisingly, not much is understood where parallel algorithm design is concerned. Specifically, while prior work has investigated the task of finding an arbitrary basis in parallel computing settings [KUW88], the more complex task of finding the optimal basis remains unexplored. We initiate this study by reexamining Borůvka's minimum weight spanning tree algorithm in the language of matroid theory, identifying as a result a new characterization of the optimal basis by way of a matroid's cocircuits. We then combine these insights with special properties of binary matroids to reduce optimization in a binary matroid to the simpler task of searching for an arbitrary basis, with only logarithmic asymptotic overhead. Consequently, we are able to compose our reduction with a known basis search method of [KUW88] to obtain a novel algorithm for finding the optimal basis of a binary matroid with only sublinearly many adaptive rounds of queries to an independence oracle. To the authors' knowledge, this is the first parallel algorithm for matroid optimization to outperform the greedy algorithm in terms of adaptive complexity, for any class of matroid not represented by a graph.
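
For reference, the classical graph version of Borůvka's algorithm can be sketched as follows. This is the textbook spanning tree procedure, not the paper's matroid generalization. The edge-list format and the tie-breaking by edge index are assumptions made for this example.

```python
def boruvka_mst(num_vertices, edges):
    """Minimum spanning tree of a connected graph.

    edges: list of (weight, u, v) tuples with vertices numbered 0..num_vertices-1.
    Returns the list of chosen (weight, u, v) edges.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    components = num_vertices
    while components > 1:
        # Round step 1: every component picks the lightest edge leaving it.
        cheapest = {}
        for idx, (w, u, v) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            key = (w, idx)  # break weight ties by edge index to avoid cycles
            for root in (ru, rv):
                if root not in cheapest or key < cheapest[root][0]:
                    cheapest[root] = (key, u, v)
        if not cheapest:
            raise ValueError("graph is not connected")
        # Round step 2: add the selected edges, merging components.
        for (w, _), u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((w, u, v))
                components -= 1
    return tree

if __name__ == "__main__":
    graph = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
    print(boruvka_mst(4, graph))
```

The selections within a round are independent of one another, which is what makes the algorithm attractive for parallel settings. The set of edges leaving a component is a cut of the graph, the graph-theoretic counterpart of the cocircuits the abstract refers to.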

The emergence of learned indexes has caused a paradigm shift in our perception of indexing by considering indexes as predictive models that estimate keys' positions within a data set, resulting in notable improvements in key search efficiency and index size reduction. However, a significant challenge inherent in learned index modeling is its constrained support for update operations, which stems from the requirement for a fixed distribution of records. Previous studies have proposed various approaches to address this issue, with the drawback of high overhead due to multiple rounds of model retraining. In this paper, we present UpLIF, an adaptive self-tuning learned index that adjusts the model to accommodate incoming updates, predicts the distribution of updates for performance improvement, and optimizes its index structure using reinforcement learning. We also introduce the concept of balanced model adjustment, which determines the model's inherent properties (i.e., bias and variance), enabling the integration of these factors into the existing index model without the need for retraining with new data. Our comprehensive experiments show that the system surpasses state-of-the-art indexing solutions (both traditional and ML-based), achieving an increase in throughput of up to 3.12 times with 1000 times less memory usage.

In this work, we consider non-collocated sensors and actuators, and we address the problem of minimizing the number of sensor-to-actuator transmissions while ensuring that the L2 gain of the system remains under a threshold. By using causal factorization and system level synthesis, we reformulate this problem as a rank minimization problem over a convex set. When heuristics like nuclear norm minimization are used for rank minimization, the resulting matrix is only numerically low rank and must be truncated, which can lead to an infeasible solution. To address this issue, we introduce approximate causal factorization to control the factorization error and provide a bound on the degradation of the L2 gain in terms of the factorization error. The effectiveness of our method is demonstrated using a benchmark.