2026-05-12 | | Total: 162
This thesis develops a theoretical framework to evaluate the monitoring capability of IoBNT networks. We consider a scenario in which nanosensors passively flow in the bloodstream and detect biomarkers associated with potential diseases, reporting their detections to external gateways on the skin that host a monitoring device. The nanosensors thus realize an artificial point-to-point communication channel between the disease region and the monitor: some packets reach the destination directly, while others are lost through vessel paths that bypass the gateway. We evaluate the network's monitoring capability over this artificial channel using the Age of Information (AoI) concept, which jointly integrates sample generation (at the disease region), carrying (nanosensor travel through vessels), and delivery (nanosensor-to-gateway) as random events. These are modeled through (i) a Markov model that follows cardiovascular physiology and (ii) channel models of reported nanocommunication technologies. We compute the Markov transition probabilities using a cardiovascular simulator built as a low-complexity electric circuit model of the human vessels. For the nanosensor-to-gateway link, we model two well-known schemes: ultrasonic and terahertz channels. Integrating these components within the AoI framework, we report information freshness via the average peak AoI (PAoI) metric. Under realistic physiological and communication assumptions, fresh information appears on the monitor within tens of seconds. The network is therefore suitable for monitoring tissue-level processes such as bacterial infections, while more adequate architectures are needed to monitor cellular-scale processes, which occur on timescales below tens of seconds.
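As a rough illustration of the freshness metric named in the abstract above, the sketch below computes an average peak AoI from generation and delivery timestamps. It uses the common definition in which the age at the monitor peaks just before a new update arrives. The function name and the timestamps are hypothetical, and the thesis itself derives the metric from a Markov model rather than from a sample trace.

```python
def average_peak_aoi(generated, delivered):
    """Mean peak Age of Information over a trace of delivered updates.

    generated[k] and delivered[k] are the generation and delivery times
    of the k-th delivered update, listed in delivery order (assumed).
    """
    if len(generated) != len(delivered) or len(delivered) < 2:
        raise ValueError("need two or more matched updates")
    peaks = []
    for k in range(1, len(delivered)):
        # Just before update k arrives, the freshest sample at the
        # monitor is still update k-1, so the age peaks at this value.
        peaks.append(delivered[k] - generated[k - 1])
    return sum(peaks) / len(peaks)


# Hypothetical timestamps in seconds, chosen only for illustration.
gen = [0.0, 10.0, 25.0]
dlv = [12.0, 30.0, 40.0]
print(average_peak_aoi(gen, dlv))  # 30.0
```

Updates lost on vessel paths that bypass the gateway never appear in the trace, so they lengthen the gap between consecutive deliveries and raise the peaks.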
We introduce SMART-HC-VQA, a Sentinel-2-based visual question answering dataset derived from the IARPA SMART Heavy Construction dataset, designed for spatiotemporal analysis of human activity. The dataset transforms construction-site annotations, construction-type labels, temporal-phase labels, geographic metadata, and observation relationships into natural language question-answer triplets. This approach redefines the existing dataset as a temporally extended automatic target recognition and visual question answering (VQA) challenge, considering a fixed geospatial site as a target whose attributes and activity states evolve across sparse satellite observations. Currently, SMART-HC-VQA comprises 21,837 accessible Sentinel-2 image chips, 65,511 single-image VQA examples, and approximately 2.3 million two-image temporal comparison examples generated via our novel Image-Pairwise Combinatorial Augmentation. We detail the workflow for retrieving and processing Sentinel-2 imagery, segmenting large satellite tiles into site-centered images, maintaining traceability to SMART-HC annotations, and analyzing the distributions of site size, observation count, temporal coverage, construction type, and phase labels. Additionally, we describe an implemented multi-image MLLM training framework based on LLaVA-NeXT Mistral-7B, adapted to accept multiple dated image inputs and train on metadata-derived VQA examples. This work offers a reproducible foundation for understanding language-guided remote sensing activities, aiming not only to detect change but also to reason about the ongoing processes, their progression, and potential future developments.
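The pairwise augmentation mentioned in the abstract above can be pictured as enumerating pairs of dated observations of a single site. The sketch below assumes unordered, chronologically sorted pairs. The abstract does not state whether pairs are ordered or filtered, and the function name and dates are hypothetical.

```python
from itertools import combinations


def temporal_pairs(observation_dates):
    """All (earlier, later) pairs of observations for one site."""
    ordered = sorted(observation_dates)  # ISO dates sort chronologically
    return list(combinations(ordered, 2))


# Hypothetical observation dates for a single construction site.
dates = ["2019-03-01", "2018-06-15", "2020-01-10", "2019-09-20"]
pairs = temporal_pairs(dates)
print(len(pairs))  # 6
print(pairs[0])    # ('2018-06-15', '2019-03-01')
```

A site with n observations yields n(n-1)/2 such pairs, which is how a few tens of thousands of image chips can expand into millions of two-image examples.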
This paper introduces a proactive Unmanned Aerial Vehicle (UAV) mobility management xApp for Open Radio Access Network (O-RAN) Near Real-Time Radio Intelligent Controller (Near-RT RIC) environments, employing Double Deep Q-Network (DDQN) reinforcement learning (RL) enhanced with transfer learning to optimise handover decisions for UAVs operating along predetermined flight trajectories. Unlike reactive approaches that respond to signal degradation, the proposed framework anticipates network conditions and minimises both outage probability and handover frequency through predictive optimisation. The system leverages centralised weight averaging to consolidate knowledge from multiple flight scenarios into a global model capable of generalising to previously unseen operational environments without extensive retraining. A comprehensive evaluation demonstrates that the proposed framework achieves a favourable trade-off between handover frequency and connectivity reliability, reducing handover events by up to 54.6% compared to greedy approaches while maintaining outage probability at practically negligible levels. The results validate the effectiveness of intelligent learning-based approaches for UAV mobility management in next-generation O-RAN architectures, thereby contributing to seamless integration of aerial user equipment into cellular networks.
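A minimal sketch of the centralised weight averaging described above, assuming uniform element-wise averaging of per-scenario model weights. The abstract does not give the weighting scheme, and the layer names and values here are hypothetical stand-ins for DDQN parameters.

```python
def average_weights(models):
    """Element-wise mean of per-scenario weights (uniform averaging assumed)."""
    if not models:
        raise ValueError("need at least one model")
    averaged = {}
    for name in models[0]:
        layers = [m[name] for m in models]
        # zip(*layers) groups the same weight position across all models.
        averaged[name] = [sum(vals) / len(models) for vals in zip(*layers)]
    return averaged


# Hypothetical weights learned on two different flight trajectories.
scenario_a = {"q_head": [0.25, -0.5], "bias": [1.0]}
scenario_b = {"q_head": [0.75, 0.0], "bias": [3.0]}
global_model = average_weights([scenario_a, scenario_b])
print(global_model)  # {'q_head': [0.5, -0.25], 'bias': [2.0]}
```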
This paper proposes the joint design of reconfigurable intelligent surfaces (RIS) and zero-forcing (ZF) precoding for the downlink (DL) multiuser multiple-input single-output (MU-MISO) setup in millimeter-wave (mmWave) bands, where ZF is particularly attractive due to its ability to suppress inter-user interference by exploiting the large antenna arrays and sparse directional channels characteristic of mmWave systems. This ensures efficient spatial multiplexing with manageable complexity, making ZF a practical choice in modern 5G/6G deployments. However, a careful design is necessary to overcome potential rank deficiency in the channel matrix. For the MU-MISO case, rank deficiency may arise if users exhibit significantly different channel gains or if, being in far-field, they are aligned with the position of the transmitter. On the other hand, the deployment of a RIS introduces artificial scattering which can shape the radio environment to address those situations. We explore the joint design under perfect channel knowledge, assess the impact of imperfect channel estimation on the bit error rate (BER) and propose a robust design of pilot transmissions that equalizes multiuser interference across users in the presence of channel errors in the precoder design. This evaluation shows the advantages of optimized RIS-aided ZF MU-MISO communication for the DL of wireless systems.
Quantitative cardiac magnetic resonance imaging (MRI) enables non-invasive myocardial tissue characterization but relies on robust motion correction within variable-length, variable-contrast image sequences. Groupwise registration, which simultaneously aligns all images, has shown greater robustness than pairwise registration for motion correction. However, current deep-learning-based groupwise registration methods cannot generalize across MRI sequences: the architecture typically encodes input data as a fixed-length channel stack, which rigidly couples network design to protocol-specific sequence length, input ordering, and contrast dynamics. At inference time, any change in imaging protocols will render the network unusable. In this work, we introduce \emph{\AnyTwoReg}, a new set-based groupwise registration framework that takes a quantitative MRI sequence as an unordered set. This set formulation fundamentally decouples network design from sequence length and input ordering. By utilizing a shared encoder and correlation-guided feature aggregation, \emph{\AnyTwoReg} constructs a permutation-invariant canonical reference for registration, and learns a permutation-equivariant mapping from images to deformation fields. Additionally, we extract contrast-insensitive image features from an existing foundation model to handle extreme contrast variations. Trained exclusively on a single public $T_1$ mapping dataset (STONE, sequence length $L=11$), \AnyTwoReg generalizes to two unseen quantitative MRI datasets (MOLLI, ASL) with variable lengths ($L \in [11, 60]$) and different contrast dynamics. It achieves strong cross-protocol generalization in a zero-shot manner, and consistently improves downstream quantitative mapping quality. Notably, while designed for quantitative MRI sequences, our framework is directly applicable to Cine MRI sequences for inter-cardiac-phase registration.
This paper studies the robustness of type-based multiple access (TBMA) in over-the-air computation (AirComp) under nonparametric estimation, where no prior knowledge of the data distribution is available. While conventional AirComp approaches rely on amplitude modulations and suffer from noise sensitivity, TBMA enables the use of more structured modulation formats that can be exploited for improved performance. We show that the superposition of transmitted signals in TBMA induces a discrete lattice structure in the received signal space, where each lattice point corresponds to the number of devices accessing a given channel resource. By exploiting this structure through nearest-lattice-point projection, noise effects can be substantially suppressed. The proposed technique achieves an exponential decay of the mean squared error (MSE) with respect to the energy-to-noise spectral density ratio, whereas in conventional techniques the MSE only scales inversely with this ratio. Simulation results validate the theoretical findings and demonstrate that TBMA provides a fundamental robustness advantage over traditional AirComp.
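The nearest-lattice-point projection described above can be sketched in a simplified real-valued setting. The sketch assumes that each channel resource yields a noisy reading of an integer device count, so the lattice is the set of integers from 0 to the number of devices. The paper's actual signal model is likely richer, and the function name and numbers are hypothetical.

```python
def project_to_lattice(received, num_devices):
    """Snap each noisy per-resource reading to the nearest valid count."""
    projected = []
    for value in received:
        nearest = round(value)  # nearest integer lattice point
        # A resource cannot be used by fewer than 0 or more than all devices.
        projected.append(min(max(nearest, 0), num_devices))
    return projected


# Hypothetical example: 10 devices spread over 4 channel resources.
true_counts = [3, 0, 5, 2]
noise = [0.3, -0.2, 0.4, -0.45]
received = [c + n for c, n in zip(true_counts, noise)]
estimate = project_to_lattice(received, num_devices=10)
print(estimate)  # [3, 0, 5, 2]
```

As long as each noise sample stays within half a lattice spacing, the projection removes it entirely. This matches the intuition behind the reported exponential decay of the MSE, although the sketch does not reproduce that analysis.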
This paper addresses the problem of homography estimation using a nonlinear observer designed on the Lie group $\mathbf{SL}(3)$ that exploits the full image information through direct image registration. Unlike traditional feature-based methods, which rely on extensive feature extraction and matching, the proposed approach formulates an observer that minimises a cost function defined directly in terms of image pixel intensities. Explicit conditions ensuring the non-degeneracy of the cost function are derived, and a comprehensive analysis is conducted to characterise and generate degenerate (unobservable) image configurations. Theoretical results demonstrate local exponential convergence of the observer. To improve local convergence properties, a second-order observer variant is introduced by incorporating the Hessian of the cost function into the correction term. Simulation results demonstrate the performance of the proposed solutions on real images.
Type 1 diabetes eliminates the body's ability to produce insulin, making glucose regulation entirely dependent on external insulin delivery and the control algorithm. Existing closed-loop methods either rely on accurate patient-specific models or do not provide formal safety guarantees, and are often computationally demanding for wearable devices. This paper proposes Glycemic Safety Tube Control (GSTC), a model-free and computationally efficient control framework for automated insulin delivery. The method enforces clinically relevant safety bounds on glucose levels by design, ensuring that glucose remains within a prescribed safe range. We also derive feasibility conditions that guarantee safety and input constraint satisfaction under bounded meal disturbances and estimation errors. The performance of GSTC is evaluated against state-of-the-art methods, including linear and nonlinear model predictive control and sliding mode control. The results demonstrate that GSTC maintains safety under varying meal patterns and patient conditions, highlighting its robustness and computational efficiency. Overall, GSTC provides a safe, efficient, and patient-independent approach for next-generation artificial pancreas systems.
We consider the problem of reconstructing the state of a network of nonlinear dynamical systems in the presence of directed higher-order interactions. Grounded on analytical convergence results, we propose an algorithmic observer design procedure that simultaneously selects the nodes to be measured and the observer gains. We complement the theoretical analysis with an exhaustive numerical investigation campaign that showcases the performance and robustness of the designed observer. Finally, the algorithmic procedure is used to fully reconstruct the opinions of a group of agents.
Event-triggered control provides a mechanism for avoiding excessive use of constrained communication bandwidth in networked multi-agent systems. However, most existing methods rely on accurate system models, which may be unavailable in practice. In this work, we propose a model-free, priority-driven reinforcement learning algorithm that learns communication priorities and control policies jointly from data in decentralized multi-agent systems. By learning communication priorities, we circumvent the hybrid action space typical in event-triggered control with binary communication decisions. We evaluate our algorithm on benchmark tasks and demonstrate that it outperforms the baseline method.
A hierarchical 2DOF (2-degree-of-freedom) structure combining Youla-Kucera (YK) parameterization and model predictive control (MPC) is presented in this paper. The YK parameterization employs the coprime factorization of the nominal system and controller, thereby introducing an auxiliary feedforward channel dedicated to system optimization and a controller parameterization channel. The feedforward channel is utilized to implement cascaded MPC for system optimization. The controller parameterization channel is utilized to achieve offset-free MPC by designing an appropriate YK parameter through the H2 optimal controller design.
Reconstructing a 3D sound field from sparse microphone measurements is a fundamental yet ill-posed problem, which we address through Acoustic Transfer Function (ATF) magnitude estimation. ATF magnitude encapsulates key perceptual and acoustic properties of a physical space with applications in room characterization and correction. Although recent generative paradigms such as Flow Matching (FM) have achieved state-of-the-art performance in speech and music generation, their potential in spatial audio remains underexplored. We propose SF-Flow, a novel framework that treats 3D ATF magnitude reconstruction as a guided generation task, using a 3D U-Net conditioned by a permutation-invariant set encoder. This architecture enables reconstruction from an arbitrary number of sparse inputs while leveraging the stable and efficient training properties of FM. Experimental results demonstrate that SF-Flow achieves accurate reconstruction up to 1 kHz, trains substantially faster than the autoencoder baseline, and improves significantly with dataset size.
Ray tracing (RT) has recently gained renewed interest in wireless communications, driven by its integration into digital twin (DT) frameworks for site-specific channel modeling. Several previous studies have validated RT at the channel level, yet how channel-level RT errors propagate into real 5G system-level key performance indicators (KPIs) on actual hardware remains unquantified. This paper addresses this gap by comparing Sionna RT-simulated channels against vector network analyzer (VNA)-measured channels using an OpenAirInterface (OAI) 5G NR testbed. Channel measurements are conducted at 20 receiver positions in an indoor laboratory, with both channel types injected into a hardware-in-the-loop channel emulator interfacing an OAIBOX MAX base station and a Quectel UE. RSRP, PUCCH SNR, and SINR are evaluated under both conditions. The results identify antenna near-field transition effects as a critical position-dependent error source, alongside material property mismatch, providing a quantitative benchmark for digital twin-based 5G and beyond network planning.
In this paper, we develop a communication-oriented complex baseband equivalent model for superheterodyne Rydberg atomic quantum receivers (RAQRs). The model explicitly captures photodetection-induced signal-dependent shot noise and its coupling with the optical operating point. By leveraging an atomic superheterodyne architecture and a strong local oscillator, we construct a complex baseband representation for both the received signal and the signal-dependent shot noise under both direct incoherent optical detection and balanced coherent optical detection. The derived model reveals that the optical operating point jointly determines the normalized effective receive gain and the equivalent noise background, thereby establishing a traceable gain-noise tradeoff governed by system design. More importantly, the proposed model shows that neglecting signal-dependent shot noise may lead to inaccurate operating-point design. Finally, by extending to the multiple-input-multiple-output (MIMO) case, we derive a lower bound on the achievable rate while considering the signal-dependent shot noise. Our analysis reveals that the non-zero asymptotic rate of RAQ-MIMO and its superiority over conventional RF-MIMO hinge on the normalized noise floor of the RAQ receive chain falling below that of RF-MIMO. Simulation results validate our analysis and yield practical, closed-form design guidelines for RAQR front ends, revealing parameter regimes in which RAQ-MIMO outperforms conventional MIMO systems.
Conventional focusing methods for Synthetic Aperture Radar (SAR) employ block processing efficiently but remain latency-heavy processes that prevent the realisation of a closed-loop cognitive SAR vision system. We present the first Online SAR Processor (OSP), an online image-formation framework that treats SAR sensing as a stream and produces focused SAR image output line by line during acquisition. OSP uses a tiny state-space surrogate model trained with teacher-student distillation and multi-stage losses. We evaluate the method on 300 GB of SAR data from Maya4, a Sentinel-1-derived dataset containing raw, range-compressed, range-cell-migration-corrected, and azimuth-compressed products. Relative to a linewise digital-signal-processing baseline, OSP delivers approximately 70$\times$ lower latency and 130$\times$ lower memory use; on a single AMD CPU core it processes one row in 16 ms with a memory footprint of 6 MB whilst maintaining a focusing quality high enough to support downstream decisions, which we illustrate with vessel detection and flood-mapping tasks.
Accurate channel estimation remains challenging in high-mobility wireless systems because Doppler shifts induce severe inter-carrier interference (ICI) in Orthogonal Frequency Division Multiplexing (OFDM). We propose an unsupervised online channel estimation framework based on Implicit Neural Representation (INR). Unlike discrete-grid estimators, the proposed method decouples channel representation from the OFDM sampling resolution by modeling the time-varying frequency-selective channel as a continuous function of time-frequency coordinates. A Sinusoidal Representation Network (SIREN) with Gaussian Fourier feature mapping captures fine-grained channel variations and high-frequency details without offline pre-training or labeled data. For each received slot, the network parameters are updated by per-slot online fitting that minimizes a physics-aware ICI loss, while a confidence-aware decision-directed loop balances reliable pilots and dynamically harvested pseudo-pilots. Simulations in realistic Vehicle-to-Everything (V2X) environments show that the proposed method achieves near-optimal link-level reliability, significantly outperforming Least Squares (LS) and robust Linear Minimum Mean Square Error (LMMSE) estimators. Compared with supervised deep learning baselines, it also exhibits strong out-of-distribution (OOD) robustness under environmental distribution shifts, establishing an adaptable data-efficient physical-layer paradigm.
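To illustrate the Gaussian Fourier feature mapping mentioned above, the sketch below uses the widely used formulation in which coordinates are projected through a fixed random Gaussian matrix and passed through cosine and sine. The feature count, the scale, and the choice of (time, frequency) as the two input coordinates are assumptions for illustration, not values from the paper.

```python
import math
import random


def make_fourier_map(in_dim, num_features, scale, seed=0):
    """Build an encoder v -> [cos(2*pi*B v), sin(2*pi*B v)], B ~ N(0, scale^2)."""
    rng = random.Random(seed)
    # B is sampled once and kept fixed; it is not trained.
    B = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
         for _ in range(num_features)]

    def encode(coords):
        proj = [2.0 * math.pi * sum(b * c for b, c in zip(row, coords))
                for row in B]
        return [math.cos(p) for p in proj] + [math.sin(p) for p in proj]

    return encode


# Hypothetical normalized (time, frequency) coordinate of one resource element.
encode = make_fourier_map(in_dim=2, num_features=4, scale=10.0)
features = encode((0.1, 0.5))
print(len(features))  # 8
```

A larger scale spreads the random frequencies more widely, which lets the downstream network fit faster variations across the time-frequency grid at the cost of a noisier fit.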
This paper introduces and analyzes Spatial Phase Manifold Communications (SPMC), a paradigm that facilitates joint communication and sensing (JCAS) over a Local Oscillator (LO)-free receiver. Information is embedded in, and recovered from, the relative spatial phase between antennas. In contrast to conventional coherent receivers that rely on LOs and on channel estimation/equalization, SPMC exploits antenna-domain correlation to form a baseband observable that is a function of inter-antenna phase differences. Since these phase differences are fundamentally tied to Direction-of-Arrival (DoA) and vice versa, the formulation recasts communication and sensing as inference over the unit-circle manifold and thus naturally supports JCAS decomposition, i.e., data and spatial sensing are encoded and recovered through DoA signatures. We develop a comprehensive framework comprising: (i) a manifold-domain signal model and corresponding phase-alphabet design; (ii) an LO-free quadrature spatial-correlator receiver architecture that resolves the phase-sign ambiguity without requiring an LO; and (iii) an analysis of error probability and sensing precision, including robustness to phase noise. The proposed paradigm is particularly suited to massive Internet-of-Things (IoT) deployments, for which hardware simplicity, LO distribution cost, power consumption, and seamless sensing integration are critical, especially at millimeter-wave and higher carrier frequencies.
In this paper, we present a learning-based control for a class of nonlinear systems that guarantees exponential stability as well as bounded output errors. The control is based on the Gaussian Process Submodel Online Learning (GPSOL) algorithm and the Disturbance Error Rate Limiting (DERL) algorithm, both of which were developed in previous work. The GPSOL algorithm provides a method to learn Gaussian Process (GP) models for subsystems online, whereas the DERL algorithm allows limiting the rate of the prediction error of these GP models. The focus of this paper is the utilization of the GP model within an adaptive controller and the derivation of corresponding stability conditions and system peak-to-peak gains by means of linear matrix inequalities (LMIs). These peak-to-peak gains are then used to prescribe a desired prediction error rate for the DERL algorithm to achieve user-defined output error bounds. The gains and the related bounds were successfully verified using a simulation model. Furthermore, results from a successful experimental validation of the bounds and the overall control structure on a pneumatic test rig are presented. While the control scheme and error bounds proposed in this paper are limited to first-order single-input-single-output systems, an extension to certain classes of higher-order and multiple-input-multiple-output systems is expected to be forthcoming.
The applications of Digital Twins (DT) and Generative AI (GenAI) have demonstrated their capabilities in modeling and learning-based wireless communications. However, their joint potential for proactive wireless system design remains largely underexplored, particularly in extremely large-scale multiple-input multiple-output (XL-MIMO) networks, characterized by hybrid near-field (NF) and far-field (FF) propagation regimes. In this work, we propose an integrated GenAI-enhanced DT framework for proactive interference management in dynamic indoor scenarios. The DT constructs a high-resolution, site-specific virtual replica of the deployment environment, understanding where and why blockage occurs within a realistic 3D representation of the indoor space. Integration of the GenAI module further assists the framework in anticipating and proactively suppressing blockage, rather than reacting after the disruption occurs. Extensive simulation results based on Sionna ray-tracing datasets demonstrate that the proposed framework achieves significant improvements in interference suppression, signal-to-interference-plus-noise ratio (SINR), and outage probability compared to conventional reactive schemes and purely deterministic DT-based approaches.
Transmission Topology Optimization has great potential to improve efficiency and flexibility of grid operations through non-costly switching actions, but previous approaches struggle with runtime performance and scalability. In this work, we present an optimization approach that leverages GPU acceleration to speed up computations. In a genetic algorithm setting, topologies are randomly mutated and evaluated in parallel for multiple optimization criteria. Because the DC loadflow solver is fully GPU-native, no CPU-GPU data transfer is required in the DC optimization loop. Using a variant of the illumination algorithm MapElites, we efficiently generate a set of diverse candidate solutions on the Pareto front. Together with an importing and AC validation step, we present an end-to-end optimization solution that runs in under 15 minutes. The approach is currently under evaluation by operational planning operators in two European TSOs. We furthermore open-source our code at github.com/eliagroup/ToOp.
Multi-sensor integration via error-state Kalman filter (KF) is widely employed for precise state estimation in cyber-physical systems (CPSs). However, this integration exposes the system to stealthy deception attacks that render conventional detection mechanisms ineffective. We propose an exposure framework to actively reveal such stealthy attacks without modifying sensor interfaces. The framework introduces a suspect mode in which the defender injects random exposure shakes into the nominal control inputs, thus creating a discrepancy between the defender's true state estimates and the attacker's manipulated state estimates, preventing the attack from remaining stealthy. We further derive an explicit exposure condition that characterizes the minimum shake magnitude to guarantee the finite-time exposure and a compensable condition that ensures the shakes do not degrade closed-loop performance. Simulation results based on a GNSS/INS-integrated UAV system verify the effectiveness of the proposed framework.
The performance of audio latent diffusion models is primarily governed by generator expressivity and the modelability of the underlying latent space. While recent research has focused primarily on the former, as well as improving the reconstruction fidelity of audio codecs, we demonstrate that latent modelability can be significantly improved through explicit factor disentanglement. We present PoDAR (Power-Disentangled Audio Representation), a framework that utilizes a randomized power augmentation and latent consistency objective to decouple signal power from invariant semantic content. This factorization makes the latent space easier to model, which both accelerates the convergence of downstream generative models and improves final overall performance. When applied to a Stable Audio 1.0 VAE with an F5-TTS generator, PoDAR achieves about a $2\times$ acceleration in convergence to match baseline performance, while increasing final speaker similarity by 0.055 and UTMOS by 0.22 on the LibriSpeech-PC dataset. Furthermore, isolating power into dedicated channels enables the application of CFG exclusively to power-invariant content, effectively extending the stable guidance regime to higher scales.
A ray-tracing (RT) enhanced back-projection algorithm (RT-BPA) for microwave imaging in multipath environments is presented. By tightly incorporating the concept of ray-tracing into a generalized version of traditional BPA, this method ensures improved image quality by addressing two important issues. First, when the line-of-sight (LOS) path is obstructed, reflected paths, if available, enable imaging of hidden targets, which extends the applicability of the standard BPA beyond its normal use case. Second, the consideration of reflected ray-paths is equivalent to virtually increasing the aperture size, thus improving image resolution without requiring new measurements. A key factor in achieving these advancements is the consideration of the vector nature of electromagnetic waves with polarization-dependent phase compensation, which is often ignored when employing a scalar-wave based formulation of the electromagnetic vector field. In addition, the presented method employs a shooting and bouncing rays (SBR) framework, offering better flexibility compared to manual path evaluation in existing RT-BPAs.
This paper studies secondary frequency control in transmission networks subject to communication delays at the cyber-physical interface and limited per-update computation at the control center. The regulation objective is formulated as a constrained economic dispatch problem incorporating generation capacity constraints, nodal power balance, transmission-flow limits, and scheduled tie-line power exchanges. Based on this formulation, we develop a passivity-based control framework in which an augmented projected primal-dual controller restores nominal frequency and drives the closed-loop system to the solution set of the constrained economic dispatch problem. Two-way communication delays between the physical network and the control center are modeled as scattering-based passive channels for the measurement uplink and the control-command downlink. This construction preserves the target equilibrium and enables a delay-robust passivity analysis of the delayed closed loop. To reduce the computational burden at the control center, we develop a randomized block-coordinate implementation of the augmented projected primal-dual controller. The resulting sampled-data closed loop preserves the target solution set and achieves local mean-square geometric convergence under suitable step-size and regularity conditions. Finally, a multivariable wave-domain interface filter is introduced to inject additional dissipation and improve the damping of the delayed interface without altering the steady-state interconnection. Simulations on the IEEE 14-bus system indicate that the proposed digital implementation accurately reproduces the delayed closed-loop behavior while reducing the per-update computational cost.
This paper proposes a framework for secure and resilient controller design for positive systems against cyber-attacks. In particular, we consider a network-controlled system where an adversary injects false data into the actuator channels to increase the control cost (performance measure) while penalizing the attack effort and subject to state-dependent constraints. Using a minimax formulation, we analyze the worst-case performance loss caused by such adversaries, which is given by the solution of a difference equation, or of an algebraic equation when the time horizon is infinite. We show that the optimal attack policy, among possible nonlinear policies, is linear. Despite the lack of explicit stealthiness constraints, we also show that when the measured output has an unstable zero which is not an unstable zero of the performance measure, the attacks can induce unbounded performance degradation. The proposed framework is also extended to systems with model uncertainty. Numerical examples illustrate the results and demonstrate how tools from positive systems and linear regulator theory can be used to mitigate cyber-attacks with low computational effort.