2026-04-17 | Total: 26
Optimized charging of electric vehicles (EVs) at public locations consists of two decisions: how much energy to deliver at what times, which is continuous, and where to plug in, which is binary. This makes optimizing EV charging a mixed-integer linear program (MILP). This discreteness undermines traditional marginal pricing methods. In this paper, we develop the first marginal-price-based mechanism for pricing EV charging with binary station access constraints. Using the result of Burer (2009), we express the EV charging as a completely positive program (CPP), whose dual is a copositive program (COP). This convex dual admits valid shadow prices even though the original allocation problem is discrete and nonconvex. By interpreting the COP dual variables as marginal prices, we construct a pricing mechanism that captures EV supply equipment (EVSE) congestion as well as charging-capacity limits. We prove that the resulting mechanism is revenue-adequate for the operator and individually rational for every EV user, in the strong sense that each user maximizes their own welfare by accepting their assigned charging plan rather than deviating to any alternative option. We further develop problem-specific inner-approximation and dimension-reduction techniques that substantially improve the computational tractability of solving the COP in our setting. Numerical experiments on both small and large scale charging instances demonstrate that our pricing mechanism captures discrete congestion effects and aligns user incentives with the system-optimal assignment, outperforming time-of-use (TOU) and convex relaxation benchmarks.
This paper presents a real-time computational framework for multi-node distributed optimization by extending the Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) algorithm. Our approach integrates adjoint sequential quadratic programming (SQP) techniques to enable efficient approximation of Jacobian information within the ALADIN embedded quadratic program, thereby reducing communication overhead. Furthermore, to decrease computational complexity, we design an event-triggered update strategy that avoids updating Hessian and Jacobian matrices at every iteration. The proposed method achieves local convergence and enhanced communication efficiency, making it well suited for time-critical applications. Numerical experiments demonstrate that our approach achieves competitive performance while exhibiting superior computational efficiency in real-time scenarios, validating its practical applicability for time-sensitive distributed optimization challenges.
In this work, we study a single-machine scheduling problem that aims at minimizing the total cost of a schedule subject to start-time dependent costs. This framework naturally captures scenarios where costs fluctuate throughout the day, such as time-varying energy or labor prices. To model more realistic scenarios, we assume that these costs lie within a budgeted uncertainty set and propose a two-stage robust optimization approach. In a first stage, the order in which activities should be executed is decided. After a cost scenario has been revealed, the starting times for each activity are established, subject to the ordering from the first stage. We demonstrate that the proposed problem is NP-hard and not approximable, implying the complexity of its robust counterpart. Furthermore, we show that already evaluating a first-stage solution is NP-hard when the uncertainty set is discrete. We develop models and solution methods for both continuous and discrete budgeted uncertainty. In computational experiments, we compare these approaches and demonstrate the advantages of including uncertainty beforehand.
The generalized moment problem (GMP) is an infinite dimensional linear problem over the cone of finite nonnegative Borel measures. When a GMP instance involves finitely many polynomial moment constraints, moment/sum-of-squares hierarchies provide a sequence of bounds converging to the optimal value. We consider GMP instances with measures supported over a compact basic semialgebraic set $X$. We study the case when $X$ has nonempty interior, and the case when $X$ is the vanishing set of prescribed polynomials forming a Gröbner basis of the ideal they generate, which we assume is real radical. Under a relative interior assumption, we show attainment of the infinite dimensional dual problem, and attainment of each associated finite dimensional sum-of-squares strengthening. For the latter we present two disjoint proofs. The first is obtained by adapting results regarding the closedness of quadratic modules, and the second builds on Csiszár's work on exponential density constructions to find a strictly feasible measure. Finally, we discuss the special case where $X$ is the product of spheres, and applications of our results to GMP instances arising from tensor optimization and quantum information theory.
We analyze an optimal control problem with pointwise tracking for a fractional semilinear elliptic partial differential equation. The diffusion is characterized by the spectral fractional Laplacian $(-\Delta)^s$ with $s \in (1/2,1)$, a range that guarantees the well-posedness of point evaluations of the state. In addition to the nonconvexity of the control problem, the main difficulty is that the adjoint equation is a fractional partial differential equation with a singular right-hand side: a linear combination of Dirac measures. We establish the existence of optimal solutions and derive first-order as well as necessary and sufficient second-order optimality conditions.
In continuous-time portfolio selection for non-concave utility functions, the martingale duality approach is widely adopted in complete markets, while the dynamic programming approach may sometimes lead to singular solutions of the Hamilton-Jacobi-Bellman (HJB) equation. We propose "dynamic Lagrange multipliers" in a non-concave utility framework, bridging the two approaches and demonstrating that the Lagrange multiplier function (in the martingale duality approach) equals the conjugate dual point related to the value function (in dynamic programming), which is exactly its partial derivative with respect to wealth. Moreover, the dynamic multiplier process exhibits homogeneity via the optimal wealth and pricing kernel processes, offering intuitive economic interpretations as a dynamic shadow price of the envelope theorem. Finally, classical optimal results are recovered and numerically validated by non-concave utility examples.
This paper addresses distributed consensus optimization problems with mixed-integer variables, with a specific focus on Boolean variables. We introduce a novel distributed algorithm that extends the Consensus Augmented Lagrangian Alternating Direction Inexact Newton (CALADIN) framework by incorporating specialized techniques for handling Boolean variables without relying on local mixed-integer solvers. Under the mild assumption of Lipschitz continuity of the objective functions, we establish rigorous convergence guarantees for both convex and nonconvex mixed-integer programming problems. Numerical experiments demonstrate that the proposed algorithm achieves competitive performance compared to existing approaches while providing rigorous convergence guarantees.
This paper investigates distributed resource allocation optimization over directed graphs with limited communication bandwidth. We develop a novel distributed algorithm that integrates the centralized Proximal Jacobian Alternating Direction Method of Multipliers (PJ-ADMM) with a finite-level quantized consensus scheme, enabling nodes to cooperatively solve the optimization in a distributed fashion. Under the assumption of convex objective functions, we establish that the proposed algorithm achieves sublinear convergence to a neighborhood of the optimal solution, with the convergence accuracy explicitly bounded by the quantization level. Numerical experiments validate that the algorithm achieves competitive performance compared to existing approaches while exhibiting communication efficiency.
We study state-feedback design for continuous-time LTI systems with a control input and an external input-output pair. Our objective is to determine feedback gains that render the closed-loop system (strictly) passive with respect to the external port while minimizing the standard LQR cost in the disturbance-free case. The resulting constrained optimization problem is intractable due to bilinear matrix inequalities. We analyze the set of passivating gains, showing it is unbounded, possibly nonconvex, path-connected, and contractible. We propose an indirect approach, in which the set of passivating feedback gains is inner-approximated by a compact, convex polytope. A projected gradient flow is employed to compute a gain within this polytope that minimizes the LQR cost. Numerical examples illustrate the effectiveness of the method.
Although Anderson acceleration (AA) is known to speed up fixed-point iterations, it is rarely applied in constrained optimization, in particular sequential quadratic programming (SQP). We show that the local convergence behavior of a general family of (inexact) SQP-type methods can benefit from AA and introduce a simple heuristic to alleviate slower convergence farther from the solution. The method is implemented in the software framework acados. Numerical examples from optimal control illustrate consistent improvements in convergence of different SQP-type methods.
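As a point of reference for the fixed-point setting mentioned above, the following is a minimal sketch of Anderson acceleration with memory depth 1, compared against the plain iteration on the toy map $g(x)=\cos(x)$. It is not the SQP variant or the acados implementation from the paper; the function names, the test map, and the tolerances are illustrative assumptions.

```python
import math

def _norm(v):
    return math.sqrt(sum(a * a for a in v))

def fixed_point(g, x0, tol=1e-12, max_iter=500):
    """Plain iteration x <- g(x); returns (x, number of g evaluations)."""
    x = list(x0)
    for k in range(1, max_iter + 1):
        gx = g(x)
        if _norm([a - b for a, b in zip(gx, x)]) < tol:
            return x, k
        x = gx
    return x, max_iter

def anderson_depth1(g, x0, tol=1e-12, max_iter=500):
    """Anderson acceleration with one stored residual (hypothetical helper)."""
    x_prev = list(x0)
    gx_prev = g(x_prev)
    r_prev = [a - b for a, b in zip(gx_prev, x_prev)]
    x = list(gx_prev)  # the first step is a plain fixed-point step
    for k in range(1, max_iter + 1):
        gx = g(x)
        r = [a - b for a, b in zip(gx, x)]  # residual g(x) - x
        if _norm(r) < tol:
            return x, k
        dr = [a - b for a, b in zip(r, r_prev)]
        denom = sum(a * a for a in dr)
        # gamma minimizes || r - gamma * (r - r_prev) || in the least-squares sense
        gamma = sum(a * b for a, b in zip(r, dr)) / denom if denom > 0.0 else 0.0
        x_next = [a - gamma * (a - b) for a, b in zip(gx, gx_prev)]
        x_prev, gx_prev, r_prev = x, gx, r
        x = x_next
    return x, max_iter

def g_cos(x):
    return [math.cos(x[0])]

x_plain, it_plain = fixed_point(g_cos, [1.0])
x_aa, it_aa = anderson_depth1(g_cos, [1.0])
print(it_plain, it_aa, x_aa[0])
```

In the scalar case this depth-1 update coincides with the secant method applied to $g(x)-x$, which is why it needs far fewer evaluations than the plain iteration here.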
This work introduces a simple and efficient linesearch method for composite minimization that accelerates proximal-gradient iterations with fast Newton-type directions. Our algorithm is based on simple operations and only requires the standard proximal-gradient oracle, similar to PANOC and ZeroFPR, provided that the nonsmooth term is convex. Noteworthy improvements include a cheaper backtracking procedure, in the sense that no additional gradients need to be evaluated, and an enlarged range of permitted stepsizes. Global subsequential convergence and local superlinear convergence are established under conventional assumptions by considering a novel merit function which is less expensive to evaluate than alternatives like the forward-backward envelope. Finally, the proposed approach is validated on model predictive control problems with collision avoidance constraints, as well as on the LIBSVM and CUTEst benchmarks.
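For readers unfamiliar with the "standard proximal-gradient oracle" referred to above, here is a minimal sketch of one such step, $x^+ = \mathrm{prox}_{\gamma g}(x - \gamma \nabla f(x))$, on a toy problem with $f(x)=\tfrac12\|x-b\|^2$ and $g=\lambda\|x\|_1$. It shows only the basic oracle, not the linesearch or Newton-type directions of the paper; the names `soft_threshold` and `prox_grad_step` and the problem data are assumptions made for illustration.

```python
def soft_threshold(v, t):
    """Proximal map of t*|.| evaluated at the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def prox_grad_step(x, grad_f, gamma, lam):
    """One proximal-gradient step for f + lam*||.||_1 with stepsize gamma."""
    g = grad_f(x)
    return [soft_threshold(xi - gamma * gi, gamma * lam) for xi, gi in zip(x, g)]

# Toy data (assumed): f(x) = 0.5*||x - b||^2, so grad f(x) = x - b.
b = [3.0, 0.5, -2.0]
lam = 1.0

def grad_f(x):
    return [xi - bi for xi, bi in zip(x, b)]

x = [0.0, 0.0, 0.0]
for _ in range(100):
    x = prox_grad_step(x, grad_f, 0.5, lam)
print(x)
```

For this separable problem the minimizer is the soft-thresholding of `b` at level `lam`, so the iterates approach `[2.0, 0.0, -1.0]`.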
While reinforcement learning has been increasingly applied to stochastic control, few studies have systematically examined policy-based methods in queuing environments modeled as a semi-Markov decision process (SMDP). To address this gap, we investigate how policy-based reinforcement learning (RL) algorithms perform when applied to the control of service rates in an M/M/1 queue, a common queuing model for manufacturing, computing, and service systems. The problem is formulated as an SMDP in which decisions occur at each new service, allowing an agent to select different service rates from a finite set of speeds, aiming to minimize an objective function that manages system congestion and energy costs. Three policy-based reinforcement learning algorithms, namely REINFORCE, Actor-Critic (A2C), and Proximal Policy Optimization (PPO), are trained in a simulated environment using two state representations: the instantaneous queue length and an augmented state that includes a one-step queue history. Performance is evaluated in terms of convergence speed, sampling efficiency, policy quality, and pseudo-regret relative to the steady-state optimum.
We propose a sequential quadratic programming (SQP) algorithm for inequality constrained optimization that is robust to the presence of bounded noise in function and derivative evaluations. We cover the case where constraint evaluations contain noise as well as the objective. The proposed algorithm is a line search SQP method with relaxations to deal with noise. We study the effect of noise on the global convergence behavior of the algorithm. We implement the algorithm with noise-aware quasi-Newton updates, and numerically observe that the algorithm can achieve accuracy proportional to the noise level and problem-dependent parameters, as suggested by the theory.
In this paper, we consider nonlinear optimization problems with a stochastic objective function and deterministic equality constraints. We propose an inexact two-stepsize stochastic sequential quadratic programming (SQP) algorithm and analyze its worst-case complexity under mild assumptions. The method utilizes a step decomposition strategy and handles stochastic gradient estimates by assigning different stepsizes to different components of the search direction. We establish the first known $\mathcal{O}(\epsilon_c^{-2})$ worst-case complexity with respect to the infeasibility measure when no constraint qualification is assumed and a worst-case complexity of $\mathcal{O}(\epsilon_c^{-1})$ when LICQ holds, matching the best known result in the literature. In addition, under mild conditions, our method achieves the optimal $\mathcal{O}(\epsilon_L^{-4})$ complexity with respect to the gradient of the Lagrangian regardless of constraint qualifications. Our results provide the first complexity guarantees for the popular Byrd-Omojokun step decomposition strategy and verify its theoretical efficacy. Numerical experiments show that our algorithm has a superior infeasibility convergence performance and a competitive KKT convergence rate compared to the state-of-the-art stochastic SQP method.
This paper presents a tractable tube-based robust data-driven predictive control scheme that uses only a single finite noisy input-state trajectory of an unknown discrete-time linear time-invariant (LTI) system. A simplex constraint is imposed on the Hankel coefficient vector, yielding explicit polyhedral bounds on the prediction mismatch induced by bounded measurement noise. Using certified initial and terminal robust positively invariant (RPI) sets, we derive a tube-tightened formulation whose online optimization problem is a strictly convex quadratic program (QP). The resulting controller guarantees recursive feasibility, robust satisfaction of input and state constraints, and practical input-to-state stability of the closed loop with respect to measurement noise. Numerical examples illustrate the effectiveness, robustness, and closed-loop performance of the proposed method.
This paper investigates continuous-time and discrete-time firing-rate and Hopfield recurrent neural networks (RNNs), with applications in nonlinear control design and implicit deep learning. First, we introduce a nonlinear separation principle that guarantees global exponential stability for the interconnection of a contracting state-feedback controller and a contracting observer, alongside parametric extensions for robustness and equilibrium tracking. Second, we derive sharp linear matrix inequality (LMI) conditions that guarantee the contractivity of both firing-rate and Hopfield neural network architectures. We establish structural relationships among these certificates, demonstrating that continuous-time models with monotone non-decreasing activations maximize the admissible weight space, and extend these stability guarantees to interconnected systems and Graph RNNs. Third, we combine our separation principle and LMI framework to solve the output reference tracking problem for RNN-modeled plants. We provide LMI synthesis methods for feedback controllers and observers, and rigorously design a low-gain integral controller to eliminate steady-state error. Finally, we derive an exact, unconstrained algebraic parameterization of our contraction LMIs to design highly expressive implicit neural networks, achieving competitive accuracy and parameter efficiency on standard image classification benchmarks.
Coverage path planning on irregular hexagonal grids is relevant to maritime surveillance, search and rescue and environmental monitoring, yet classical methods are often compared on small ad hoc examples or on rectangular grids. This paper presents a reproducible benchmark of deterministic single-vehicle coverage path planning heuristics on irregular hexagonal graphs derived from synthetic but maritime-motivated areas of interest. The benchmark contains 10,000 Hamiltonian-feasible instances spanning compact, elongated, and irregular morphologies, 17 heuristics from seven families, and a common evaluation protocol covering Hamiltonian success, complete-coverage success, revisits, path length, heading changes, and CPU latency. Across the released dataset, heuristics with explicit shortest-path reconnection solve the relaxed coverage task reliably but almost never produce zero-revisit tours. Exact Depth-First Search confirms that every released instance is Hamiltonian-feasible. The strongest classical Hamiltonian baseline is a Warnsdorff variant that uses an index-based tie-break together with a terminal-inclusive residual-degree policy, reaching 79.0% Hamiltonian success. The dominant design choice is not tie-breaking alone, but how the residual degree is defined when the endpoint is reserved until the final move. This shows that underreported implementation details can materially affect performance on sparse geometric graphs with bottlenecks. The benchmark is intended as a controlled testbed for heuristic analysis rather than as a claim of operational optimality at fleet scale.
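To make the Warnsdorff idea concrete, the sketch below implements the textbook rule (always move to the unvisited neighbor with the fewest unvisited neighbors) with an index-based tie-break on a small toy graph. It does not implement the terminal-inclusive residual-degree policy studied in the paper, and the graph is a 2x3 rectangular grid chosen for brevity rather than an irregular hexagonal instance; the function name and the adjacency data are assumptions.

```python
def warnsdorff_path(adj, start):
    """Greedy Warnsdorff walk; returns (path, True if every node was visited once)."""
    visited = {start}
    path = [start]
    current = start

    def residual_degree(n):
        # Number of still-unvisited neighbors of n.
        return sum(1 for m in adj[n] if m not in visited)

    while len(path) < len(adj):
        candidates = [n for n in adj[current] if n not in visited]
        if not candidates:
            return path, False  # dead end before covering the graph
        # Fewest onward options first; ties broken by smallest node index.
        current = min(candidates, key=lambda n: (residual_degree(n), n))
        visited.add(current)
        path.append(current)
    return path, True

# Toy 2x3 grid (assumed example), nodes laid out as
#   0 1 2
#   3 4 5
grid = {
    0: [1, 3],
    1: [0, 2, 4],
    2: [1, 5],
    3: [0, 4],
    4: [1, 3, 5],
    5: [2, 4],
}

path, ok = warnsdorff_path(grid, 0)
print(path, ok)
```

On this toy graph the walk from node 0 is `[0, 3, 4, 1, 2, 5]`, a Hamiltonian path. The tie between nodes 1 and 5 at the third move is resolved by index.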
We consider deterministic finite-horizon optimal control problems with a fixed initial state. We introduce an on-line policy iteration method which, starting from a given policy, however obtained, generates a sequence of cost-improving policies and corresponding trajectories. Each policy produces a trajectory, which is used in turn to generate data for training the next policy. The method is motivated by problems that are repeatedly solved starting from the same initial state, including discrete optimization and path planning for repetitive tasks. For such problems, the method is fast enough to be used on-line. Under a natural consistency condition, we show that the sequence of costs of the generated policies is monotonically improving for the given initial state (but not necessarily for other states). We illustrate our results with computational studies from combinatorial optimization and 3-dimensional path planning for drones in the presence of obstacles. We also briefly discuss a stochastic counterpart of our algorithm. Our proposed framework combines elements of rollout and policy iteration with flexible trajectory-based policy representations, and applies to problems involving a single as well as multiple decision makers. It also provides a principled way to train neural network-based policies using trajectory data, while preserving monotonic cost improvement.
In this paper, we identify the smallest set of control input nodes and an associated output feedback law that achieves complete disturbance decoupling for a class of coupled oscillator networks. The focus is specifically on systems linearized around a stable phase-locked synchronized state. The proposed theoretical framework is applied to the linearized swing dynamics of power grids operating near synchronization. In this context, the disturbance decoupling problem corresponds to isolating subsets of nodes from exogenous disturbances by means of batteries that can both add and withdraw active power. Numerical simulations carried out on the IEEE New England 39-bus system show that the proposed methodology not only yields a minimal actuator placement ensuring effective disturbance rejection, but also preserves the internal stability of the closed-loop system.
This paper analyzes the implications of simplified pipeline gas flow models for integrated energy system planning. A case study of an integrated power-hydrogen expansion planning problem shows that simplifying pressure-flow relationships and gas dynamics can lead to expansion plans that incur substantial regret when evaluated under a more realistic dynamic gas flow model -- due to suboptimal system expansion, operation, and non-supplied hydrogen. Numerical experiments show that planning under the highly simplified transport and transport-linepack models -- commonly used in expansion studies -- can result in regret exceeding several thousand percent and yield expansion plans that lack robustness across demand levels. Planning under steady-state conditions partially mitigates these effects, but still leaves significant cost-reduction potential untapped compared to dynamic planning due to neglected linepack flexibility. Developing efficient solution algorithms for the dynamic model is a promising direction for future research.
We present a geometric framework for Reinforcement Learning (RL) that views policies as maps into the Wasserstein space of action probabilities. First, we define a Riemannian structure induced by stationary distributions, proving its existence in a general context. We then define the tangent space of policies and characterize the geodesics, specifically addressing the measurability of vector fields mapped from the state space to the tangent space of probability measures over the action space. Next, we formulate a general RL optimization problem and construct a gradient flow using Otto's calculus. We compute the gradient and the Hessian of the energy, providing a formal second-order analysis. Finally, we illustrate the method with numerical examples for low-dimensional problems, computing the gradient directly from our theoretical formalism. For high-dimensional problems, we parameterize the policy using a neural network and optimize it based on an ergodic approximation of the cost.
Zeroth-order (ZO) methods are widely used when gradients are unavailable or prohibitively expensive, including black-box learning and memory-efficient fine-tuning of large models, yet their optimization dynamics in deep learning remain underexplored. In this work, we provide an explicit step size condition that exactly captures the (mean-square) linear stability of a family of ZO methods based on the standard two-point estimator. Our characterization reveals a sharp contrast with first-order (FO) methods: whereas FO stability is governed solely by the largest Hessian eigenvalue, mean-square stability of ZO methods depends on the entire Hessian spectrum. Since computing the full Hessian spectrum is infeasible in practical neural network training, we further derive tractable stability bounds that depend only on the largest eigenvalue and the Hessian trace. Empirically, we find that full-batch ZO methods operate at the edge of stability: ZO-GD, ZO-GDM, and ZO-Adam consistently stabilize near the predicted stability boundary across a range of deep learning training problems. Our results highlight an implicit regularization effect specific to ZO methods, where large step sizes primarily regularize the Hessian trace, whereas in FO methods they regularize the top eigenvalue.
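As an illustration of the two-point estimator mentioned above, the following sketch uses one common form, the central difference $\hat g = \frac{f(x+\mu u)-f(x-\mu u)}{2\mu}\,u$ with Gaussian $u$, inside a plain ZO gradient descent loop on a small quadratic. The exact estimator, step sizes, and test problems in the paper may differ; the function names, the quadratic, and all parameter values here are assumptions.

```python
import random

def two_point_grad(f, x, mu, rng):
    """Central-difference two-point gradient estimate along a Gaussian direction."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    scale = (f(x_plus) - f(x_minus)) / (2.0 * mu)
    return [scale * ui for ui in u]

def zo_gd(f, x0, step, mu, iters, seed=0):
    """Zeroth-order gradient descent using only function evaluations."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        g = two_point_grad(f, x, mu, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def quad(x):
    # Toy objective (assumed): Hessian eigenvalues 1 and 2.
    return 0.5 * (x[0] ** 2 + 2.0 * x[1] ** 2)

x_final = zo_gd(quad, [1.0, 1.0], step=0.05, mu=1e-3, iters=500)
print(quad(x_final))
```

With this small step size the iterates contract toward the minimizer at the origin. Increasing `step` is a simple way to probe where the iteration stops being stable on this toy problem.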
The Lion optimizer is a popular learning-based optimization algorithm in machine learning that shows impressive performance in training many deep learning models. Although the convergence properties of the Lion optimizer have been studied, its generalization analysis is still missing. To fill this gap, we study the generalization properties of Lion via algorithmic stability, using mathematical induction. Specifically, we prove that Lion has a generalization error of $O(\frac{1}{N\tau^T})$, where $N$ is the training sample size, $\tau>0$ denotes the smallest absolute value of a nonzero element of the gradient estimator, and $T$ is the total number of iterations. In addition, we obtain an interesting byproduct: the SignSGD algorithm has the same generalization error as Lion. To enhance the generalization of Lion, we design a novel and efficient Cautious Lion (CLion) optimizer that uses the sign function cautiously. Moreover, we prove that CLion has a generalization error of $O(\frac{1}{N})$, lower than the $O(\frac{1}{N\tau^T})$ of Lion, since the parameter $\tau$ is generally very small. We also study the convergence of CLion and prove that it attains a fast convergence rate of $O(\frac{\sqrt{d}}{T^{1/4}})$ in the $\ell_1$-norm of the gradient for nonconvex stochastic optimization, where $d$ denotes the model dimension. Extensive numerical experiments demonstrate the effectiveness of the CLion optimizer.
Behavior cloning (BC) policies on position-controlled robots inherit the closed-loop response of the underlying PD controller, yet the effect of controller gains on BC failure lacks a nonasymptotic theory. We show that independent sub-Gaussian action errors propagate through the gain-dependent closed-loop dynamics to yield sub-Gaussian position errors whose proxy matrix $X_\infty(K)$ governs the failure tail. The probability of horizon-$T$ task failure factorizes into a gain-dependent amplification index $\Gamma_T(K)$ and the validation loss plus a generalization slack, so training loss alone cannot predict closed-loop performance. Under shape-preserving upper-bound structural assumptions the proxy admits the scalar bound $X_\infty(K)\preceq\Psi(K)\bar X$ with $\Psi(K)$ decomposed into label difficulty, injection strength, and contraction, ranking the four canonical regimes with compliant-overdamped (CO) tightest, stiff-underdamped (SU) loosest, and the stiff-overdamped versus compliant-underdamped ordering system-dependent. For the canonical scalar second-order PD system the closed-form continuous-time stationary variance $X_\infty^{\mathrm{c}}(\alpha,\beta)=\sigma^2\alpha/(2\beta)$ is strictly monotone in stiffness and damping over the entire stable orthant, covering both underdamped and overdamped regimes, and the exact zero-order-hold (ZOH) discretization inherits this monotonicity. The analysis provides the first nonasymptotic explanation of the empirical finding that compliant, overdamped controllers improve BC success rates.
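The closed form for the scalar case can be checked by hand under one modeling assumption that the abstract does not state: the PD loop is $\ddot x + \beta\dot x + \alpha x = \alpha\sigma w$ with unit-intensity white noise $w$, i.e. the action error enters through the stiffness gain $\alpha$. Under that assumption, with state $(x,\dot x)$, the stationary covariance $X$ solves a Lyapunov equation, sketched here:

```latex
\begin{aligned}
A &= \begin{pmatrix} 0 & 1 \\ -\alpha & -\beta \end{pmatrix},\quad
B = \begin{pmatrix} 0 \\ \alpha\sigma \end{pmatrix},\quad
A X + X A^\top + B B^\top = 0, \\
(1,1):&\quad 2X_{12} = 0, \\
(1,2):&\quad X_{22} - \alpha X_{11} - \beta X_{12} = 0, \\
(2,2):&\quad -2\alpha X_{12} - 2\beta X_{22} + \alpha^2\sigma^2 = 0, \\
\Rightarrow&\quad X_{22} = \frac{\alpha^2\sigma^2}{2\beta},\qquad
X_{11} = \frac{X_{22}}{\alpha} = \frac{\sigma^2\alpha}{2\beta}.
\end{aligned}
```

Under this assumed model the position variance $X_{11}$ increases with stiffness $\alpha$ and decreases with damping $\beta$ for all $\alpha,\beta>0$, which agrees with the stated monotonicity and with compliant, overdamped gains giving the smallest variance.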
We formulate a method to co-optimize power system capacity planning decisions and policy investments that shape electricity load patterns. To this end, we leverage a gradient-based solution technique that enables the efficient solution of operation-aware planning models. To compute gradients with respect to the conditions that define daily electricity demand profiles, we introduce and formalize the concept of differentiable scenario generation and show that generative machine learning models satisfy the mathematical requirements needed to compute consistent gradients. We demonstrate the feasibility of the proposed approach through numerical experiments using a diffusion model-based scenario generator and a stylized generation and capacity expansion planning model.