Systems and Control | Cool Papers - Immersive Paper Discovery

#1 Guidance Design for Escape Flight Vehicle Using Evolution Strategy Enhanced Deep Reinforcement Learning

Authors: Xiao Hu ; Tianshu Wang ; Min Gong ; Shaoshi Yang

Guidance commands of flight vehicles are a series of data sets with fixed time intervals, thus guidance design constitutes a sequential decision problem and satisfies the basic conditions for using deep reinforcement learning (DRL). In this paper, we consider the scenario where the escape flight vehicle (EFV) generates guidance commands based on DRL and the pursuit flight vehicle (PFV) generates guidance commands based on the proportional navigation method. For the EFV, the objective of the guidance design entails progressively maximizing the residual velocity, subject to the constraint imposed by the given evasion distance. Thus an irregular dynamic max-min problem of extremely large scale is formulated, where the time instant when the optimal solution can be attained is uncertain and the optimal solution depends on all the intermediate guidance commands generated before. To solve this problem, a two-step strategy is conceived. In the first step, we use the proximal policy optimization (PPO) algorithm to generate the guidance commands of the EFV. The results obtained by PPO in the global search space are coarse, despite the fact that the reward function, the neural network parameters and the learning rate are designed elaborately. Therefore, in the second step, we propose to invoke the evolution strategy (ES) based algorithm, which uses the result of PPO as the initial value, to further improve the quality of the solution by searching in the local space. Simulation results demonstrate that the proposed guidance design method based on the PPO algorithm is capable of achieving a residual velocity of 67.24 m/s, higher than the residual velocities achieved by the benchmark soft actor-critic and deep deterministic policy gradient algorithms. Furthermore, the proposed ES-enhanced PPO algorithm outperforms the PPO algorithm by 2.7%, achieving a residual velocity of 69.04 m/s.
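The second step of this abstract (local search that starts from the PPO result) can be illustrated with a generic elitist evolution strategy. This is only a minimal sketch under stated assumptions: the abstract does not say which ES variant the authors use, so a plain (1+λ) scheme with Gaussian mutation is swapped in, and `toy_objective`, `theta_ppo`, and all parameter values are hypothetical stand-ins for the residual-velocity objective and the PPO-trained parameters.

```python
import random


def es_refine(objective, theta0, sigma=0.05, offspring=16, generations=40, seed=0):
    """Elitist (1+lambda) evolution strategy: local search around theta0.

    The parent is only replaced when a child scores strictly higher, so the
    returned score can never be worse than the score of the starting point.
    """
    rng = random.Random(seed)
    parent = list(theta0)
    parent_score = objective(parent)
    for _ in range(generations):
        # Mutate the parent with isotropic Gaussian noise to get the offspring.
        children = [[t + rng.gauss(0.0, sigma) for t in parent]
                    for _ in range(offspring)]
        best_child = max(children, key=objective)
        best_score = objective(best_child)
        if best_score > parent_score:
            parent, parent_score = best_child, best_score
    return parent, parent_score


def toy_objective(theta):
    # Hypothetical stand-in for the residual velocity (higher is better).
    return -sum((t - 1.0) ** 2 for t in theta)


# Hypothetical "coarse" solution, playing the role of the PPO output.
theta_ppo = [0.8, 1.3, 0.9]
theta_es, score_es = es_refine(toy_objective, theta_ppo)
```

Because the search is elitist, the refined score is at least as good as the initial one, which mirrors the role the abstract assigns to ES: polishing a coarse global solution rather than replacing it.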

#2 Distributed Invariant Kalman Filter for Cooperative Localization using Matrix Lie Groups

Authors: Yizhi Zhou ; Yufan Liu ; Pengxiang Zhu ; Xuan Wang

This paper studies the problem of Cooperative Localization (CL) for multi-robot systems, where a group of mobile robots jointly localize themselves by using measurements from onboard sensors and shared information from other robots. We propose a novel distributed invariant Kalman Filter (DInEKF), based on Lie group theory, to solve the CL problem in a 3-D environment. Unlike the standard EKF, which computes the Jacobians based on the linearization at the state estimate, DInEKF defines the robots' motion model on matrix Lie groups and offers the advantage of state estimate-independent Jacobians. This significantly improves the consistency of the estimator. Moreover, the proposed algorithm is fully distributed, relying solely on each robot's ego-motion measurements and information received from its one-hop communication neighbors. The effectiveness of the proposed algorithm is validated in both Monte-Carlo simulations and real-world experiments. The results show that the proposed DInEKF outperforms the standard distributed EKF in terms of both accuracy and consistency.

#3 Imitation Learning for Adaptive Video Streaming with Future Adversarial Information Bottleneck Principle

Authors: Shuoyao Wang ; Jiawei Lin ; Fangwei Ye

Adaptive video streaming plays a crucial role in ensuring high-quality video streaming services. Despite extensive research efforts devoted to Adaptive BitRate (ABR) techniques, the current reinforcement learning (RL)-based ABR algorithms may benefit the average Quality of Experience (QoE) but suffer from fluctuating performance in individual video sessions. In this paper, we present a novel approach that combines imitation learning with the information bottleneck technique, to learn from the complex offline optimal scenario rather than inefficient exploration. In particular, we leverage the deterministic offline bitrate optimization problem with the future throughput realization as the expert and formulate it as a mixed-integer non-linear programming (MINLP) problem. To enable large-scale training for improved performance, we propose an alternative optimization algorithm that efficiently solves the MINLP problem. To address the issue of overfitting due to the future information leakage in MINLP, we incorporate an adversarial information bottleneck framework. By compressing the video streaming state into a latent space, we retain only action-relevant information. Additionally, we introduce a future adversarial term to mitigate the influence of future information leakage, where a Model Predictive Control (MPC) policy without any future information is employed as the adverse expert. Experimental results demonstrate the effectiveness of our proposed approach in significantly enhancing the quality of adaptive video streaming, providing a 7.30% average QoE improvement and a 30.01% average ranking reduction.

#4 CityLearn v2: Energy-flexible, resilient, occupant-centric, and carbon-aware management of grid-interactive communities

Authors: Kingsley Nweye ; Kathryn Kaspar ; Giacomo Buscemi ; Tiago Fonseca ; Giuseppe Pinto ; Dipanjan Ghose ; Satvik Duddukuru ; Pavani Pratapa ; Han Li ; Javad Mohammadi ; Luis Lino Ferreira ; Tianzhen Hong ; Mohamed Ouf ; Alfonso Capozzoli ; Zoltan Nagy

As more distributed energy resources become part of the demand-side infrastructure, it is important to quantify the energy flexibility they provide on a community scale, particularly to understand the impact of geographic, climatic, and occupant behavioral differences on their effectiveness, as well as identify the best control strategies to accelerate their real-world adoption. CityLearn provides an environment for benchmarking simple and advanced distributed energy resource control algorithms including rule-based, model-predictive, and reinforcement learning control. CityLearn v2 presented here extends CityLearn v1 by providing a simulation environment that leverages the End-Use Load Profiles for the U.S. Building Stock dataset to create virtual grid-interactive communities for resilient, multi-agent distributed energy resources and objective control with dynamic occupant feedback. This work details the v2 environment design and provides application examples that utilize reinforcement learning to manage battery energy storage system charging/discharging cycles, vehicle-to-grid control, and thermal comfort during heat pump power modulation.

#5 Characterizing Regional Importance in Cities with Human Mobility Motifs in Metro Networks

Authors: Shuyang Shi ; Ding Lyu ; Lin Wang ; Xiaofan Wang ; Guanrong Chen

Uncovering higher-order spatiotemporal dependencies within human mobility networks offers valuable insights into the analysis of urban structures. In most existing studies, human mobility networks are typically constructed by aggregating all trips without distinguishing who takes which specific trip. Instead, we claim individual mobility motifs, higher-order structures generated by daily trips of people, as fundamental units of human mobility networks. In this paper, we propose two network construction frameworks at the level of mobility motifs in characterizing regional importance in cities. Firstly, we enhance the structural dependencies within mobility motifs and proceed to construct mobility networks based on the enhanced mobility motifs. Secondly, taking inspiration from PageRank, we speculate that people would allocate values of importance to destinations according to their trip intentions. A motif-wise network construction framework is proposed based on the established mechanism. Leveraging large-scale metro data across cities, we construct three types of human mobility networks and characterize the regional importance by node importance indicators. Our comparison results suggest that the motif-based mobility network outperforms the classic mobility network, thus highlighting the efficacy of the introduced human mobility motifs. Finally, we demonstrate that the performance in characterizing the regional importance is significantly improved by our motif-wise framework.

#6 Merging Parameter Estimation and Classification Using LASSO

Authors: Le Wang ; Ying Wang ; Yu Qiu ; Mian Li ; Håkan Hjalmarsson

Soft sensing is a way to indirectly obtain information about signals for which direct sensing is difficult or prohibitively expensive. It may not be evident a priori which sensors provide useful information about the target signal. There may be sensors irrelevant for the estimation as well as sensors for which the information is very poor. It is often required that the soft sensor should cover a wide range of operating points. This means that some sensors may be useful in certain operating conditions while irrelevant in others, while others may have no bearing on the target signal whatsoever. However, this type of structural information is typically not available but has to be deduced from data. A further compounding issue is that multiple operating conditions may be described by the same model, but which ones is not known in advance either. In this contribution, we provide a systematic method to construct a soft sensor that can deal with these issues. While different models can be used, we adopt multi-input single-output finite impulse response models since they are linear in the parameters. We propose a single estimation criterion, where the objectives are encoded in terms of model fit, model sparsity (reducing the number of different models), and model parameter coefficient sparsity (to exclude irrelevant sensors). A post-processing model clustering step is also included. As proof of concept, the method is tested on field test datasets from a prototype vehicle.
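The coefficient-sparsity idea (an l1 penalty that drives the coefficients of irrelevant sensors to exactly zero) can be shown with a small coordinate-descent LASSO. This is a sketch of plain LASSO only, not the paper's combined criterion, which also penalizes the number of distinct models. The three "sensor" columns, the data, and the penalty `lam` are hypothetical, and the objective assumed here is (1/(2n))·||y − Xw||² + lam·||w||₁.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    Assumes every column of X has at least one non-zero entry.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n)
    return w


# Hypothetical data: three mutually orthogonal "sensor" signals of length 8.
x1 = [1.0, -1.0] * 4
x2 = [1.0, 1.0, -1.0, -1.0] * 2
x3 = [1.0] * 4 + [-1.0] * 4
X = [list(row) for row in zip(x1, x2, x3)]
# The target depends on sensors 1 and 3 only; sensor 2 is irrelevant.
y = [2.0 * a - 1.0 * c for a, c in zip(x1, x3)]

w = lasso_coordinate_descent(X, y, lam=0.1)
```

With this orthogonal design the LASSO solution is the soft-thresholded least-squares estimate, so the irrelevant sensor gets a coefficient of exactly zero while the relevant ones are shrunk by `lam`.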

#7 Resource Optimization in UAV-assisted IoT Networks: The Role of Generative AI

Authors: Sana Sharif ; Sherali Zeadally ; Waleed Ejaz

We investigate how generative Artificial Intelligence (AI) can be used to optimize resources in Unmanned Aerial Vehicle (UAV)-assisted Internet of Things (IoT) networks. In particular, generative AI models for real-time decision-making have been used in public safety scenarios. This work describes how generative AI models can improve resource management within UAV-assisted networks. Furthermore, this work presents generative AI in UAV-assisted networks to demonstrate its practical applications and highlight its broader capabilities. We present a real-life case study for public safety, demonstrating how generative AI can enhance real-time decision-making and improve training datasets. By leveraging generative AI in UAV-assisted networks, we can design more intelligent, adaptive, and efficient ecosystems to meet the evolving demands of wireless networks and diverse applications. Finally, we discuss challenges and future research directions associated with generative AI for resource optimization in UAV-assisted networks.

#8 Playing Games with your PET: Extending the Partial Exploration Tool to Stochastic Games

Authors: Tobias Meggendorfer ; Maximilian Weininger

We present version 2.0 of the Partial Exploration Tool (PET), a tool for verification of probabilistic systems. We extend the previous version by adding support for stochastic games, based on a recent unified framework for sound value iteration algorithms. Thereby, PET2 is the first tool implementing a sound and efficient approach for solving stochastic games with objectives of the type reachability/safety and mean payoff. We complement this approach by developing and implementing a partial-exploration based variant for all three objectives. Our experimental evaluation shows that PET2 offers the most efficient partial-exploration based algorithm and is the most viable tool on SGs, even outperforming unsound tools.

#9 An Electronically Tunable 28-34 GHz 2-D Steerable Leaky Wave Antenna

Authors: Mahdi Alesheikh ; Md Hedayatullah Maktoomi ; Soheil Saadat ; Hamidreza Aghasi

In this paper, a 2-D beam steering mm-wave antenna based on the leaky wave configuration is presented. Microstrip leaky wave antennas are known to exhibit beam rotation by changing the frequency. In this work, the microstrip leaky wave antenna is adopted and co-integrated with electronically tunable board components that periodically load the antenna. By independent control of variable capacitors and diodes, single-frequency 2-D beam steering across the bandwidth is achieved. The proposed antenna is fabricated in Rogers printed circuit board technologies and the simulation results exhibit a peak realized gain of 8 dBi, radiation bandwidth of 28-34 GHz, radiation efficiency of more than 80%, and more than 90° and 70° of beam rotation in the φ and θ directions.

#10 Roadside Units Assisted Localized Automated Vehicle Maneuvering: An Offline Reinforcement Learning Approach

Authors: Kui Wang ; Changyang She ; Zongdian Li ; Tao Yu ; Yonghui Li ; Kei Sakaguchi

Traffic intersections present significant challenges for the safe and efficient maneuvering of connected and automated vehicles (CAVs). This research proposes an innovative roadside unit (RSU)-assisted cooperative maneuvering system aimed at enhancing road safety and traveling efficiency at intersections for CAVs. We utilize RSUs for real-time traffic data acquisition and train an offline reinforcement learning (RL) algorithm based on human driving data. Evaluation results obtained from hardware-in-loop autonomous driving simulations show that our approach, employing the twin delayed deep deterministic policy gradient and behavior cloning (TD3+BC), achieves performance comparable to state-of-the-art autonomous driving systems in terms of safety measures while significantly enhancing travel efficiency by up to 17.38% in intersection areas. This paper makes a pivotal contribution to the field of intelligent transportation systems, presenting a breakthrough solution for improving urban traffic flow and safety at intersections.

#11 Latency and Energy Minimization in NOMA-Assisted MEC Network: A Federated Deep Reinforcement Learning Approach

Authors: Arian Ahmadi ; Anders Høst-Madsen ; Zixiang Xiong

Multi-access edge computing (MEC) is seen as a vital component of forthcoming 6G wireless networks, aiming to support emerging applications that demand high service reliability and low latency. However, ensuring the ultra-reliable and low-latency performance of MEC networks poses a significant challenge due to uncertainties associated with wireless links, constraints imposed by communication and computing resources, and the dynamic nature of network traffic. Enabling ultra-reliable and low-latency MEC mandates efficient load balancing jointly with resource allocation. In this paper, we investigate the joint optimization problem of offloading decisions, computation and communication resource allocation to minimize the expected weighted sum of delivery latency and energy consumption in a non-orthogonal multiple access (NOMA)-assisted MEC network. Given that the formulated problem is a mixed-integer non-linear programming (MINLP) problem, a new multi-agent federated deep reinforcement learning (FDRL) solution based on double deep Q-network (DDQN) is developed to efficiently optimize the offloading strategies across the MEC network while accelerating the learning process of the Internet-of-Things (IoT) devices. Simulation results show that the proposed FDRL scheme can effectively reduce the weighted sum of delivery latency and energy consumption of IoT devices in the MEC network and outperform the baseline approaches.

#12 Derisking of subsynchronous torsional oscillations in power systems with conventional and inverter-based generation

Authors: Nicolas Bonafé ; Julian Freytes ; Hani Saad

This article proposes an application of a derisking methodology for subsynchronous torsional oscillations considering a realistic use case. The main objective is to summarize and draft a synthetic paper clarifying the complete methodology, highlighting the main information needed step by step. For exemplification, a real model from a decommissioned oil power plant is adopted, where a fictitious high voltage direct current power link is connected. In this article, stress is laid on details of the application of the derisking methods: the unit interaction factor and the complex torque coefficients method. Then, the different steps to obtain results are explicitly explained. Moreover, the design and tuning process of a supplementary subsynchronous damping controller is discussed. This mitigation section uses minimal information to correctly damp the unstable oscillations, as one would expect from industrial projects where the data sharing may be limited. Finally, the resources needed to perform each step of the study are summarized.

#13 Optimizing Prosumer Policies in Periodic Double Auctions Inspired by Equilibrium Analysis (Extended Version)

Authors: Bharat Manvi ; Sanjay Chandlekar ; Easwar Subramanian

We consider a periodic double auction (PDA) wherein the main participants are wholesale suppliers and brokers representing retailers. The suppliers are represented by a composite supply curve and the brokers are represented by individual bids. Additionally, the brokers can participate in small-scale selling by placing individual asks; hence, they act as prosumers. Specifically, in a PDA, the prosumers who are net buyers have multiple opportunities to buy or sell multiple units of a commodity with the aim of minimizing the cost of buying across multiple rounds of the PDA. Formulating optimal bidding strategies for such a PDA setting involves planning across current and future rounds while considering the bidding strategies of other agents. In this work, we propose Markov perfect Nash equilibrium (MPNE) policies for a setup where multiple prosumers with knowledge of the composite supply curve compete to procure commodities. Thereafter, the MPNE policies are used to develop an algorithm called MPNE-BBS for the case wherein the prosumers need to re-construct an approximate composite supply curve using past auction information. The efficacy of the proposed algorithm is demonstrated on the PowerTAC wholesale market simulator against several baselines and state-of-the-art bidding policies.
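To make the auction setting concrete, one round of a double auction can be cleared by matching the highest bids against the cheapest asks. The sketch below is only illustrative and rests on several assumptions: every order is for a single unit, and the clearing price is taken as the midpoint of the last matched bid and ask. The abstract does not state the clearing rule of the PowerTAC market, and the function name and the numbers are hypothetical.

```python
def clear_double_auction(bids, asks):
    """Clear one round of a unit-quantity double auction.

    bids: prices buyers are willing to pay (one unit each).
    asks: prices sellers are willing to accept (one unit each).
    Returns (number_of_matched_units, clearing_price or None).
    """
    bids_sorted = sorted(bids, reverse=True)  # most eager buyers first
    asks_sorted = sorted(asks)                # cheapest supply first
    matched = 0
    for bid, ask in zip(bids_sorted, asks_sorted):
        if bid < ask:
            break  # demand curve has dropped below the supply curve
        matched += 1
    if matched == 0:
        return 0, None
    # Assumed pricing rule: midpoint of the last (marginal) matched pair.
    price = (bids_sorted[matched - 1] + asks_sorted[matched - 1]) / 2
    return matched, price


# Hypothetical order book for a single round.
units, price = clear_double_auction(bids=[50, 40, 30, 20], asks=[25, 35, 45, 55])
```

In a periodic double auction this clearing is repeated over several rounds, which is why a bidder has to trade off buying now against waiting for a later round, as the abstract describes.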

#14 A 49.8 mm² Fully Integrated, 1.5 m Transmission-Range, High-Data-Rate IR-UWB Transmitter for Brain Implants

Authors: Cong Ding ; Mingxiang Gao ; Anja K. Skrivervik ; Mahsa Shoaran

To address the challenge of extending the transmission range of implantable TXs while also minimizing their size and power consumption, this paper introduces a transcutaneous, high data-rate, fully integrated IR-UWB transmitter that employs a novel co-designed power amplifier (PA) and antenna interface for enhanced performance. With the co-designed interface, we achieved the smallest footprint of 49.8 mm² and the longest transmission range of 1.5 m compared to the state-of-the-art IR-UWB TXs.

#15 Long-term usage of the off-grid photovoltaic system with lithium-ion battery-based energy storage system on high mountains: A case study in Paiyun Lodge on Mt. Jade in Taiwan

Author: Hsien-Ching Chung

Energy supply on high mountains remains an open issue since grid connection is unavailable. In the past, diesel generators with lead-acid battery energy storage systems (ESSs) were applied in most cases. Recently, a photovoltaic (PV) system with a lithium-ion (Li-ion) battery ESS has become an appropriate method for solving this problem in a greener way. In 2016, an off-grid PV system with Li-ion battery ESS was installed in Paiyun Lodge on Mt. Jade (the highest lodge in Taiwan). After operation for more than 7 years, the aging problem of the whole electric power system has become a critical issue for long-term usage. In this work, a method is established for analyzing the massive energy data (over 7 million rows) and estimating the health of the Li-ion battery system, such as daily operation patterns as well as C-rate, temperature, and accumulated energy distributions. The accomplished electric power improvement project dealing with the power system aging is reported. Based on the long-term usage experience, a simple cost analysis model between lead-acid and Li-ion battery systems is built, explaining that the expensive Li-ion batteries can compete with the cheap lead-acid batteries for long-term usage on high mountains. This case study provides engineers and researchers a fundamental understanding of the long-term usage of off-grid PV ESSs and engineering on high mountains.

#16 Weighted Least-Squares PARSIM

Authors: Jiabao He ; Cristian R. Rojas ; Håkan Hjalmarsson

Subspace identification methods (SIMs) have proven very powerful for estimating linear state-space models. To overcome the deficiencies of classical SIMs, a significant number of algorithms have appeared over the last two decades, where most of them involve a common intermediate step, that is to estimate the range space of the extended observability matrix. In this contribution, an optimized version of the parallel and parsimonious SIM (PARSIM), PARSIM_opt, is proposed by using weighted least-squares. It not only inherits all the benefits of PARSIM but also attains the best linear unbiased estimator for the above intermediate step. Furthermore, inspired by SIMs based on the predictor form, consistent estimates of the optimal weighting matrix for weighted least-squares are derived. Essential similarities, differences and simulated comparisons of some key SIMs related to our method are also presented.

#17 A Weighted Least-Squares Method for Non-Asymptotic Identification of Markov Parameters from Multiple Trajectories

Authors: Jiabao He ; Cristian R. Rojas ; Håkan Hjalmarsson

Markov parameters play a key role in system identification. There exist many algorithms where these parameters are estimated using least-squares in a first, pre-processing, step, including subspace identification and multi-step least-squares algorithms, such as Weighted Null-Space Fitting. Recently, there has been an increasing interest in non-asymptotic analysis of estimation algorithms. In this contribution we identify the Markov parameters using weighted least-squares and present a non-asymptotic analysis for such an estimator. To cover both stable and unstable systems, multiple trajectories are collected. We show that with the optimal weighting matrix, weighted least-squares gives a tighter error bound than ordinary least-squares for the case of non-uniformly distributed measurement errors. Moreover, as the optimal weighting matrix depends on the system's true parameters, we introduce two methods to consistently estimate the optimal weighting matrix, where the convergence rate of these estimates is also provided. Numerical experiments demonstrate improvements of weighted least-squares over ordinary least-squares in finite sample settings.
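Why weighting helps when measurement errors are not uniform can be seen in the simplest possible case, a scalar model y_i = θ·x_i + e_i with known, unequal noise standard deviations. The sketch below is a textbook illustration and not the paper's Markov-parameter estimator or its error bound. The regressors `x`, the noise levels `sigma`, and the value of θ are hypothetical, and the optimal weights are assumed to be the inverse noise variances.

```python
def weighted_ls(x, y, weights):
    """Scalar weighted least-squares estimate of theta in y_i = theta * x_i + e_i."""
    num = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return num / den


def variance_ols(x, sigma):
    """Estimator variance with unit weights when Var(e_i) = sigma_i^2."""
    sxx = sum(xi * xi for xi in x)
    return sum((xi * si) ** 2 for xi, si in zip(x, sigma)) / sxx ** 2


def variance_wls(x, sigma):
    """Estimator variance with the optimal weights w_i = 1 / sigma_i^2."""
    return 1.0 / sum((xi / si) ** 2 for xi, si in zip(x, sigma))


# Hypothetical regressors and non-uniform noise levels.
x = [1.0, 2.0, 3.0, 4.0]
sigma = [0.1, 0.1, 1.0, 2.0]
weights = [1.0 / s ** 2 for s in sigma]

var_ols = variance_ols(x, sigma)   # 73.05 / 900, roughly 0.081
var_wls = variance_wls(x, sigma)   # 1 / 513, roughly 0.0019

# Sanity check on noise-free data: both estimators recover theta = 3.
y_clean = [3.0 * xi for xi in x]
theta_wls = weighted_ls(x, y_clean, weights)
theta_ols = weighted_ls(x, y_clean, [1.0] * len(x))
```

The weighted variance is never larger than the unweighted one (a consequence of the Cauchy-Schwarz inequality), and the two coincide when all noise levels are equal, which matches the abstract's restriction of the improvement to non-uniform measurement errors.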

#18 Grey-box Recursive Parameter Identification of a Nonlinear Dynamic Model for Mineral Flotation

Authors: Rodrigo A. González ; Paulina Quintanilla

This study presents a grey-box recursive identification technique to estimate key parameters in a mineral flotation process across two scenarios. The method is applied to a nonlinear physics-based dynamic model validated at a laboratory scale, allowing real-time updates of two model parameters, n and C, in response to changing conditions. The proposed approach effectively adapts to process variability and allows for continuous adjustments based on operational fluctuations, resulting in a significantly improved estimation of concentrate grade, a key performance indicator. In Scenario 1, parameters n and C achieved fit metrics of 97.99 and 96.86, respectively, with concentrate grade estimations improving from 75.1 to 98.69 using recursive identification. In Scenario 2, the fit metrics for n and C were 96.27 and 95.48, respectively, with the concentrate grade estimations increasing from 96.27 to 99.45 with recursive identification. The results demonstrate the effectiveness of the proposed grey-box recursive identification method in accurately estimating parameters and predicting concentrate grade in a mineral flotation process.
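The general idea of updating parameters sample by sample can be sketched with recursive least squares (RLS) and a forgetting factor. This is a generic stand-in under strong assumptions: the paper works with a nonlinear physics-based flotation model whose recursive algorithm is not given in the abstract, whereas the toy model here is linear in its two parameters. The regressor signal, the true parameter values, and the forgetting factor are hypothetical.

```python
import math


def rls_step(theta, P, phi, y, forgetting=0.98):
    """One recursive least-squares update for the model y = phi^T theta + noise.

    theta: current parameter estimate (list of length d)
    P:     current d x d covariance-like matrix (list of lists, symmetric)
    phi:   regressor vector for the new sample
    y:     new measurement
    """
    d = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(d)) for i in range(d)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(d))
    gain = [v / denom for v in P_phi]
    error = y - sum(phi[i] * theta[i] for i in range(d))
    theta_new = [theta[i] + gain[i] * error for i in range(d)]
    # P is symmetric, so phi^T P equals (P phi)^T.
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(d)]
             for i in range(d)]
    return theta_new, P_new


# Hypothetical two-parameter model: y = a * u + b, with true values a = 2.0, b = -0.5.
true_theta = [2.0, -0.5]
theta = [0.0, 0.0]
P = [[1000.0, 0.0], [0.0, 1000.0]]  # large initial P means low confidence in theta
for k in range(50):
    phi = [math.sin(0.3 * k), 1.0]
    y = true_theta[0] * phi[0] + true_theta[1] * phi[1]  # noise-free measurement
    theta, P = rls_step(theta, P, phi, y)
```

A forgetting factor below 1 discounts old samples, which is what lets a recursive estimator follow parameters that drift with operating conditions; a value of 1 would weight all past data equally.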

#19 Asymmetry of Frequency Distribution in Power Systems: Sources, Impact and Control

Authors: Taulant Kerci ; Federico Milano

This letter analyses the sources of asymmetry of frequency probability distributions (PDs) and their impact on the dynamic behaviour of power systems. The letter also discusses how secondary control can reduce this asymmetry. We also propose an asymmetry index based on the difference between the left and right-hand side standard deviations of the frequency PDs. The IEEE 9-bus system and real-world data obtained from the Irish transmission system serve to show that losses, saturations and wind generation lead to asymmetric PDs. A relevant result is that the droop-based frequency support provided by wind generation using a tight deadband of 15 mHz significantly increases the asymmetry of the frequency PDs.
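An index of the kind described above can be sketched as follows. The abstract only says the index is based on the difference between left and right-hand side standard deviations, so the details here are assumptions: samples are split around a reference frequency `center`, each side's spread is the root-mean-square deviation from that reference, and the index is the right-side value minus the left-side value. The sample values are hypothetical.

```python
import math


def one_sided_std(samples, center, side):
    """Root-mean-square deviation from center, using only one side's samples."""
    if side == "right":
        deviations = [x - center for x in samples if x > center]
    else:
        deviations = [x - center for x in samples if x < center]
    if not deviations:
        return 0.0
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def asymmetry_index(samples, center):
    """Assumed definition: right-side spread minus left-side spread.

    Zero for a distribution symmetric about center, positive when the
    over-frequency side is wider, negative when the under-frequency side is.
    """
    return one_sided_std(samples, center, "right") - one_sided_std(samples, center, "left")


# Hypothetical frequency samples in Hz around a 50 Hz nominal value.
symmetric = [49.9, 49.95, 50.05, 50.1]
skewed_high = [49.98, 49.99, 50.05, 50.2]

index_symmetric = asymmetry_index(symmetric, 50.0)
index_skewed = asymmetry_index(skewed_high, 50.0)
```

The sign convention and the choice of reference (nominal frequency versus sample mean or mode) are design choices of this sketch and may differ from the letter's definition.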