2026-02-11 | | Total: 25
Efficient exploration remains a central challenge in reinforcement learning (RL), particularly in sparse-reward environments. We introduce Optimistic World Models (OWMs), a principled and scalable framework for optimistic exploration that brings classical reward-biased maximum likelihood estimation (RBMLE) from adaptive control into deep RL. In contrast to upper confidence bound (UCB)-style exploration methods, OWMs incorporate optimism directly into model learning by augmentation with an optimistic dynamics loss that biases imagined transitions toward higher-reward outcomes. This fully gradient-based loss requires neither uncertainty estimates nor constrained optimization. Our approach is plug-and-play with existing world model frameworks, preserving scalability while requiring only minimal modifications to standard training procedures. We instantiate OWMs within two state-of-the-art world model architectures, leading to Optimistic DreamerV3 and Optimistic STORM, which demonstrate significant improvements in sample efficiency and cumulative return compared to their baseline counterparts.
This work investigates the impact of position and attitude perturbations on the beamforming performance of multi-satellite systems. The system under analysis is a formation of small satellites equipped with direct radiating arrays that synthesise a large virtual antenna aperture. The results show that performance is highly sensitive to the considered perturbations. However, by incorporating position and attitude information into the beamforming process, nominal performance can be effectively restored. These findings support the development of control-aware beamforming strategies that tightly integrate the attitude and orbit control system with signal processing to enable robust beamforming and autonomous coordination.
The rapid growth of data centers has made large electronic load (LEL) modeling increasingly important for power system analysis. Such loads are characterized by fast workload-driven variability and protection-driven disconnection and reconnection behavior that are not captured by conventional load models. Existing data center load modeling includes physics-based approaches, which provide interpretable structure for grid simulation, and data-driven approaches, which capture empirical workload variability from data. However, physics-based models are typically uncalibrated to facility-level operation, while trajectory alignment in data-driven methods often leads to overfitting and unrealistic dynamic behavior. To resolve these limitations, we design the framework to leverage both physics-based structure and data-driven adaptability. The physics-based structure is parameterized to enable data-driven pattern-consistent calibration from real operational data, supporting facility-level grid planning. We further show that trajectory-level alignment is limited for inherently stochastic data center loads. Therefore, we design the calibration to align temporal and statistical patterns using temporal contrastive learning (TCL). This calibration is performed locally at the facility, and only calibrated parameters are shared with utilities, preserving data privacy. The proposed load model is calibrated by real-world operational load data from the MIT Supercloud, ASU Sol, Blue Waters, and ASHRAE datasets. Then it is integrated into the ANDES platform and evaluated on the IEEE 39-bus, NPCC 140-bus, and WECC 179-bus systems. We find that interactions among LELs can fundamentally alter post-disturbance recovery behavior, producing compound disconnection-reconnection dynamics and delayed stabilization that are not captured by uncalibrated load models.
Legged robot research is presently focused on bipedal or quadrupedal robots, despite capabilities to build robots with many more legs to potentially improve locomotion performance. This imbalance is not necessarily due to hardware limitations, but rather to the absence of principled control frameworks that explain when and how additional legs improve locomotion performance. In multi-legged systems, coordinating many simultaneous contacts introduces a severe curse of dimensionality that challenges existing modeling and control approaches. As an alternative, multi-legged robots are typically controlled using low-dimensional gaits originally developed for bipeds or quadrupeds. These strategies fail to exploit the new symmetries and control opportunities that emerge in higher-dimensional systems. In this work, we develop a principled framework for discovering new control structures in multi-legged locomotion. We use geometric mechanics to reduce contact-rich locomotion planning to a graph optimization problem, and propose a spin model duality framework from statistical mechanics to exploit symmetry breaking and guide optimal gait reorganization. Using this approach, we identify an asymmetric locomotion strategy for a hexapod robot that achieves a forward speed of 0.61 body lengths per cycle (a 50% improvement over conventional gaits). The resulting asymmetry appears at both the control and hardware levels. At the control level, the body orientation oscillates asymmetrically between fast clockwise and slow counterclockwise turning phases for forward locomotion. At the hardware level, two legs on the same side remain unactuated and can be replaced with rigid parts without degrading performance. Numerical simulations and robophysical experiments validate the framework and reveal novel locomotion behaviors that emerge from symmetry reforming in high-dimensional embodied systems.
The accelerating militarization of artificial intelligence has transformed the ethics, politics, and governance of warfare. This article interrogates how AI-driven targeting systems function as epistemic infrastructures that classify, legitimize, and execute violence, using Israel's conduct in Gaza as a paradigmatic case. Through the lens of responsibility, the article examines three interrelated dimensions: (a) political responsibility, exploring how states exploit AI to accelerate warfare while evading accountability; (b) professional responsibility, addressing the complicity of technologists, engineers, and defense contractors in the weaponization of data; and (c) personal responsibility, probing the moral agency of individuals who participate in or resist algorithmic governance. This is complemented by an examination of the position and influence of those participating in public discourse, whose narratives often obscure or normalize AI-enabled violence. The Gaza case reveals AI not as a neutral instrument but as an active participant in the reproduction of colonial hierarchies and the normalization of atrocity. Ultimately, the paper calls for a reframing of technological agency and accountability in the age of automated warfare. It concludes that confronting algorithmic violence demands a democratization of AI ethics, one that resists technocratic fatalism and centers the lived realities of those most affected by high-tech militarism.
L7 load balancers are a fundamental building block in microservices as they enable fine-grained traffic distribution. Compared to monolithic applications, microservices demand higher performance and stricter isolation from load balancers. This is due to the increased number of instances, longer service chains, and the necessity for co-location with services on the same host. Traditional sidecar-based load balancers are ill-equipped to meet these demands, often resulting in significant performance degradation. In this work, we present XLB, a novel architecture that reshapes L7 load balancers as in-kernel interposition operating on the socket layer. We leverage eBPF to implement the core load balancing logic in the kernel, and address the connection management and state maintenance challenges through novel socket layer redirection and nested eBPF maps designs. XLB eliminates the extra overhead of scheduling, communication, and data movement, resulting in a more lightweight, scalable, and efficient L7 load balancer architecture. Compared to the widely used microservices load balancers (Istio and Cilium), over 50 microservice instances, XLB achieves up to 1.5x higher throughput and 60% lower end-to-end latency.
The interconnection of 5G and non-terrestrial networks (NTNs) has been actively studied to expand connectivity beyond conventional terrestrial infrastructure. In the 3GPP standardization of 5G systems, the 5G Quality of Service (QoS) Identifier (5QI) is defined to characterize the QoS requirements of different traffic requirements. However, it falls short in capturing the diverse latency, capacity, and reliability profiles of NTN environments, particularly when NTNs are used as backhaul. Furthermore, it is difficult to manage individual traffic flows and perform efficient resource allocation and routing when a large number of 5G traffic flows are present in NTN systems. To address these challenges, we propose an optimization framework that enhances QoS handling by introducing an NTN QoS Identifier (NQI) and grouping 5G traffic into NTN slices based on similar requirements. This enables unified resource control and routing for a large number of 5G flows in NTN systems. In this paper, we present the detailed procedure of the proposed framework, which consists of 5QI to NQI mapping, NTN traffic to NTN slice mapping, and slice-level flow and routing optimization. We evaluate the framework by comparing multiple mapping schemes through numerical simulations and analyze their impact on overall network performance.
The agricultural industry has been searching for a high-speed successor to the 250~kbit/s CAN bus backbone of ISO~11783 (ISOBUS) for over a decade, yet no protocol-level solution has reached standardization. Meanwhile, modern planters, sprayers, and Virtual Terminals are already constrained by the bus bandwidth. This paper presents ISO FastLane, a gateway-less dual-stack approach that routes point-to-point ISOBUS traffic over Ethernet while keeping broadcast messages on the existing CAN bus. The solution requires no new state machines, no middleware, and no changes to application layer code: only a simple Layer~3 routing decision and a lightweight peer discovery mechanism called Augmented Address Claim (AACL). Legacy devices continue to operate unmodified and unaware of FastLane traffic. Preliminary tests reported on the paper demonstrate that ISO FastLane accelerates Virtual Terminal object pool uploads by factor of 8 and sustains Task Controller message rates over 100 times beyond the current specification limit. Because ISO FastLane builds entirely on existing J1939 and ISO~11783 conventions, it can be implemented by ISOBUS engineers in a matter of weeks. This is delivering tangible performance gains today, without waiting for the long-term High Speed ISOBUS solution.
Deterministic frequency deviations (DFDs) are systematic and predictable excursions of grid frequency that arise from synchronized generation ramps induced by electricity market scheduling. In this paper, we analyze the impact of the European day-ahead market reform of 1 October 2025, which replaced hourly trading blocks with quarter-hourly blocks, on DFDs in the Central European synchronous area. Using publicly available frequency measurements, we compare periods before and after the reform based on daily frequency profiles, indicators characterizing frequency deviations, principal component analysis, Fourier-based functional data analysis, and power spectral density analysis. We show that the reform substantially reduces characteristic hourly frequency deviations and suppresses dominant spectral components at hourly and half-hourly time scales, while quarter-hourly structures gain relative importance. While the likelihood of large frequency deviations decreases overall, reductions for extreme events are less clear and depend on the metric used. Our results demonstrate that market design reforms can effectively mitigate systematic frequency deviations, but also highlight that complementary technical and regulatory measures are required to further reduce large frequency excursions in low-inertia power systems.
The transition toward low-inertia power systems demands modeling frameworks that provide not only accurate state predictions but also physically consistent sensitivities for control. While scientific machine learning offers powerful nonlinear modeling tools, the control-oriented implications of different differentiable paradigms remain insufficiently understood. This paper presents a comparative study of Physics-Informed Neural Networks (PINNs), Neural Ordinary Differential Equations (NODEs), and Differentiable Programming (DP) for modeling, identification, and control of power system dynamics. Using the Single Machine Infinite Bus (SMIB) system as a benchmark, we evaluate their performance in trajectory extrapolation, parameter estimation, and Linear Quadratic Regulator (LQR) synthesis. Our results highlight a fundamental trade-off between data-driven flexibility and physical structure. NODE exhibits superior extrapolation by capturing the underlying vector field, whereas PINN shows limited generalization due to its reliance on a time-dependent solution map. In the inverse problem of parameter identification, while both DP and PINN successfully recover the unknown parameters, DP achieves significantly faster convergence by enforcing governing equations as hard constraints. Most importantly, for control synthesis, the DP framework yields closed-loop stability comparable to the theoretical optimum. Furthermore, we demonstrate that NODE serves as a viable data-driven surrogate when governing equations are unavailable.
Feedback optimization refers to a class of methods that steer a control system to a steady state that solves an optimization problem. Despite tremendous progress on the topic, an important problem remains open: enforcing state constraints at all times. The difficulty in addressing it lies on mediating between the safety enforcement and the closed-loop stability, and ensuring the equivalence between closed-loop equilibria and the optimization problem's critical points. In this work, we present a feedback-optimization method that enforces state constraints at all times employing high-order control-barrier functions. We provide several results on the proposed controller dynamics, including well-posedness, safety guarantees, equivalence between equilibria and critical points, and local and global (in certain convex cases) asymptotic stability of optima. Various simulations illustrate our results.
Lane changing in dense traffic is a significant challenge for Connected and Autonomous Vehicles (CAVs). Existing lane change controllers primarily either ensure safety or collaboratively improve traffic efficiency, but do not consider these conflicting objectives together. To address this, we propose the Multi-Agent Safety Shield (MASS), designed using Control Barrier Functions (CBFs) to enable safe and collaborative lane changes. The MASS enables collaboration by capturing multi-agent interactions among CAVs through interaction topologies constructed as a graph using a simple algorithm. Further, a state-of-the-art Multi-Agent Reinforcement Learning (MARL) lane change controller is extended by integrating MASS to ensure safety and defining a customised reward function to prioritise efficiency improvements. As a result, we propose a lane change controller, known as MARL-MASS, and evaluate it in a congested on-ramp merging simulation. The results demonstrate that MASS enables collaborative lane changes with safety guarantees by strictly respecting the safety constraints. Moreover, the proposed custom reward function improves the stability of MARL policies trained with a safety shield. Overall, by encouraging the exploration of a collaborative lane change policy while respecting safety constraints, MARL-MASS effectively balances the trade-off between ensuring safety and improving traffic efficiency in congested traffic. The code for MARL-MASS is available with an open-source licence at https://github.com/hkbharath/MARL-MASS
Electric vehicles (EVs) are increasingly deployed, yet range limitations remain a key barrier. Improving energy efficiency via advanced control is therefore essential, and emerging vehicle automation offers a promising avenue. However, many existing strategies rely on indirect surrogates because linking power consumption to control inputs is difficult. We propose a neural-network (NN) identifier that learns this mapping online and couples it with an actor-critic reinforcement learning (RL) framework to generate optimal control commands. The resulting actor-critic-identifier architecture removes dependence on explicit models relating total power, recovered energy, and inputs, while maintaining accurate speed tracking and maximizing efficiency. Update laws are derived using Lyapunov stability analysis, and performance is validated in simulation. Compared to a traditional controller, the method increases total energy recovery by 12.84%, indicating strong potential for improving EV energy efficiency.
Gyroscopic interconnections enable redistribution of energy among degrees of freedom while preserving passivity and total energy, and they play a central role in controlled Lagrangian methods and IDA-PBC. Yet their quantitative effect on transient energy exchange and subsystem performance is not well characterised. We study a conservative mechanical system with constant skew-symmetric velocity coupling. Its dynamics are integrable and evolve on invariant two-tori, whose projections onto subsystem phase planes provide geometric description of energy exchange. When the ratio of normal-mode frequencies is rational, these projections become closed resonant Lissajous curves, enabling structured analysis of subsystem trajectories. To quantify subsystem behaviour, we introduce the inscribed-radius metric: the radius of the largest origin-centred circle contained in a projected trajectory. This gives a lower bound on attainable subsystem energy and acts as an internal performance measure. We derive resonance conditions and develop an efficient method to compute or certify the inscribed radius without time-domain simulation. Our results show that low-order resonances can strongly restrict energy depletion through phase-locking, whereas high-order resonances recover conservative bounds. These insights lead to an explicit interconnection-shaping design framework for both energy absorption and containment control strategies, while taking responsiveness into account.
Ensuring small-signal stability in power systems with a high share of inverter-based resources (IBRs) is hampered by two factors: (i) device and network parameters are often uncertain or completely unknown, and (ii) brute-force enumeration of all topologies is computationally intractable. These challenges motivate plug-and-play (PnP) certificates that verify stability locally yet hold globally. Passivity is an attractive property because it guarantees stability under feedback and network interconnections; however, strict passivity rarely holds for practical controllers such as Grid Forming Inverters (GFMs) employing P-Q droop. This paper extends the passivity condition by constructing a dynamic, frequency-dependent multiplier that enables PnP stability certification of each component based solely on its admittance, without requiring any modification to the controller design. The multiplier is parameterised as a linear filter whose coefficients are tuned under a passivity goal. Numerical results for practical droop gains confirm the PnP rules, substantially enlarging the certified stability region while preserving the decentralised, model-agnostic nature of passivity-based PnP tests.
With over 3.5 million 5G base stations deployed globally, their collective energy consumption (projected to exceed 131 TWh annually) raises significant concerns over both operational costs and environmental impacts. In this paper, we present EExAPP, a deep reinforcement learning (DRL)-based xApp for 5G Open Radio Access Network (O-RAN) that jointly optimizes radio unit (RU) sleep scheduling and distributed unit (DU) resource slicing. EExAPP uses a dual-actor-dual-critic Proximal Policy Optimization (PPO) architecture, with dedicated actor-critic pairs targeting energy efficiency and quality-of-service (QoS) compliance. A transformer-based encoder enables scalable handling of variable user equipment (UE) populations by encoding all-UE observations into fixed-dimensional representations. To coordinate the two optimization objectives, a bipartite Graph Attention Network (GAT) is used to modulate actor updates based on both critic outputs, enabling adaptive tradeoffs between power savings and QoS. We have implemented EExAPP and deployed it on a real-world 5G O-RAN testbed with live traffic, commercial RU and smartphones. Extensive over-the-air experiments and ablation studies confirm that EExAPP significantly outperforms existing methods in reducing the energy consumption of RU while maintaining QoS.
Non-contact current monitoring has emerged as a prominent research focus owing to its non-intrusive characteristics and low maintenance requirements. However, while they offer high sensitivity, contactless sensors necessitate sophisticated design methodologies and thorough experimental validation. In this study, a Giant Magneto-Resistance (GMR) sensor is employed to monitor the instantaneous currents of a three-phase 400-volt overhead line, and its performance is evaluated against that of a conventional contact-based Hall effect sensor. A mathematical framework is developed to calculate current from the measured magnetic field signals. Furthermore, a MATLAB-based dashboard is implemented to enable real-time visualization of current measurements from both sensors under linear and non-linear load conditions. The GMR current sensor achieved a relative accuracy of 64.64% to 91.49%, with most phases above 80%. Identified improvements over this are possible, indicating that the sensing method has potential as a basis for calculating phase currents.
Simulated Annealing (SA) is a widely used stochastic optimization algorithm, yet much of its theoretical understanding is limited to asymptotic convergence guarantees or general spectral bounds. In this paper, we develop a finite-time analytical framework for constant-temperature SA by studying a piecewise linear cost function that permits exact characterization. We model SA as a discrete-state Markov chain and first derive a closed-form expression for the expected time to escape a single linear basin in a one-dimensional landscape. We show that this expression also accurately predicts the behavior of continuous-state searches up to a constant scaling factor, which we analyze empirically and explain via variance matching, demonstrating convergence to a factor of sqrt(3) in certain regimes. We then extend the analysis to a two-basin landscape containing a local and a global optimum, obtaining exact expressions for the expected time to reach the global optimum starting from the local optimum, as a function of basin geometry, neighborhood radius, and temperature. Finally, we demonstrate how the predicted basin escape time can be used to guide the design of a simple two-temperature switching strategy.
This work presents a finite-time stable pose estimator (FTS-PE) for rigid bodies undergoing rotational and translational motion in three dimensions, using measurements from onboard sensors that provide position vectors to inertially-fixed points and body velocities. The FTS-PE is a full-state observer for the pose (position and orientation) and velocities and is obtained through a Lyapunov analysis that shows its stability in finite time and its robustness to bounded measurement noise. Further, this observer is designed directly on the state space, the tangent bundle of the Lie group of rigid body motions, SE(3), without using local coordinates or (dual) quaternion representations. Therefore, it can estimate arbitrary rigid body motions without encountering singularities or the unwinding phenomenon and be readily applied to autonomous vehicles. A version of this observer that does not need translational velocity measurements and uses only point clouds and angular velocity measurements from rate gyros, is also obtained. It is discretized using the framework of geometric mechanics for numerical and experimental implementations. The numerical simulations compare the FTS-PE with a dual-quaternion extended Kalman filter and our previously developed variational pose estimator (VPE). The experimental results are obtained using point cloud images and rate gyro measurements obtained from a Zed 2i stereo depth camera sensor. These results validate the stability and robustness of the FTS-PE.
The accurate characterization of tire dynamics is critical for advancing control strategies in autonomous road vehicles, as tire behavior significantly influences handling and stability through the generation of forces and moments at the tire-road interface. Smart tire technologies have emerged as a promising tool for sensing key variables such as road friction, tire pressure, and wear states, and for estimating kinematic and dynamic states like vehicle speed and tire forces. However, most existing estimation and control algorithms rely on empirical correlations or machine learning approaches, which require extensive calibration and can be sensitive to variations in operating conditions. In contrast, model-based techniques, which leverage infinite-dimensional representations of tire dynamics using partial differential equations (PDEs), offer a more robust approach. This paper proposes a novel model-based, output-feedback lateral tracking control strategy for all-wheel steering vehicles that integrates distributed tire dynamics with smart tire technologies. The primary contributions include the suppression of micro-shimmy phenomena at low speeds and path-following via force control, achieved through the estimation of tire slip angles, vehicle kinematics, and lateral tire forces. The proposed controller and observer are based on formulations using ODE-PDE systems, representing rigid body dynamics and distributed tire behavior. This work marks the first rigorous control strategy for vehicular systems equipped with distributed tire representations in conjunction with smart tire technologies.
Dynamic models, particularly rate-dependent models, have proven effective in capturing the key phenomenological features of frictional processes, whilst also possessing important mathematical properties that facilitate the design of control and estimation algorithms. However, many rate-dependent formulations are built on empirical considerations, whereas physical derivations may offer greater interpretability. In this context, starting from fundamental physical principles, this paper introduces a novel class of first-order dynamic friction models that approximate the dynamics of a bristle element by inverting the friction characteristic. Amongst the developed models, a specific formulation closely resembling the LuGre model is derived using a simple rheological equation for the bristle element. This model is rigorously analyzed in terms of stability and passivity -- important properties that support the synthesis of observers and controllers. Furthermore, a distributed version, formulated as a hyperbolic partial differential equation (PDE), is presented, which enables the modeling of frictional processes commonly encountered in rolling contact phenomena. The tribological behavior of the proposed description is evaluated through classical experiments and validated against the response predicted by the LuGre model, revealing both notable similarities and key differences.
Unmanned Aerial Vehicles (UAVs) are crucial for advancing railway communication by offering reliable connectivity, adaptive coverage, and mobile edge services . This survey examines UAV-assisted approaches for 6G railway needs including ultra-reliable low-latency communication (URLLC) and integrated sensing and communication (ISAC). We cover railway channel models, reconfigurable intelligent surfaces (RIS), and UAV-assisted mobile edge computing (MEC). Key challenges include coexistence with existing systems, handover management, Doppler effect, and security. The roadmap suggests work on integrated communication-control systems and AI-driven optimization for intelligent railway networks.
The Teacher Assignment Problem is a combinatorial optimization problem that involves assigning teachers to courses while guaranteeing that all courses are covered, teachers do not teach too few or too many hours, teachers do not switch assigned courses too often and possibly teach the courses they favor. Typically the problem is solved manually, a task that requires several hours every year. In this work we present a mathematical formulation for the problem and an experimental evaluation of the model implemented using state-of-the-art SMT, CP, and MILP solvers. The implementations are tested over a real-world dataset provided by the Division of Systems and Control at Chalmers University of Technology, and produce teacher assignments with smaller workload deviation, a more even workload distribution among the teachers, and a lower number of switched courses.
Urban energy systems face increasing challenges due to high penetration of renewable energy sources, extreme weather events, and other high-impact, low-probability disruptions. This project proposes a community-centered, open-access framework to enhance the resilience and reliability of urban power and gas networks by integrating microgrid partitioning, mobile energy storage deployment, and data-driven risk assessment. The approach involves converting passive distribution networks into active, self-healing microgrids using distributed energy resources and remotely controlled switches to enable flexible reconfiguration during normal and emergency operations. To address uncertainties from intermittent renewable generation and variable load, an adjustable interval optimization method combined with a column and constraint generation algorithm is developed, providing robust planning solutions without requiring probabilistic information. Additionally, a real-time online risk assessment tool is proposed, leveraging 25 multi-dimensional indices including load, grid status, resilient resources, emergency response, and meteorological factors to support operational decision-making during extreme events. The framework also optimizes the long-term sizing and allocation of mobile energy storage units while incorporating urban traffic data for effective routing during emergencies. Finally, a novel time-dependent resilience and reliability index is introduced to quantify system performance under diverse operating conditions. The proposed methodology aims to enable resilient, efficient, and adaptable urban energy networks capable of withstanding high-impact disruptions while maximizing operational and economic benefits.
Modern applications, such as orchestrating the collective behavior of robotic swarms or traffic flows, require the coordination of large groups of agents evolving in unstructured environments, where disturbances and unmodeled dynamics are unavoidable. In this work, we develop a scalable macroscopic density control framework in which a feedback law is designed directly at the level of an advection--diffusion partial differential equation. We formulate the control problem in the density space and prove global exponential convergence towards the desired behavior in $\mathcal{L}^2$ with guaranteed asymptotic rejection of bounded unknown drift terms, explicitly accounting for heterogeneous agent dynamics, unmodeled behaviors, and environmental perturbations. Our theoretical findings are corroborated by numerical experiments spanning heterogeneous oscillators, traffic systems, and swarm robotics in partially unknown environments.