2026-05-07 | | Total: 68
Background: Existing MRI LLM benchmarks rely mainly on review-book multiple-choice questions, where top proprietary models already score highly, limiting discrimination. No systematic benchmark has evaluated vendor-specific scanner operational knowledge central to research MRI practice. Purpose: We developed MRI-Eval, a tiered benchmark for relative model comparison on MRI physics and GE scanner operations knowledge using primary multiple-choice questions (MCQ), with stem-only and primed diagnostic conditions as complementary analyses. Methods: MRI-Eval includes 1365 scored items across nine categories and three difficulty tiers from textbooks, GE scanner manuals, programming course materials, and expert-generated questions. Five model families were evaluated (GPT-5.4, Claude Opus 4.6, Claude Sonnet 4.6, Gemini 2.5 Pro, Llama 3.3 70B). MCQ was primary; stem-only removed options and used an independent LLM judge; primed stem-only tested responses to incorrect user claims. Results: Overall MCQ accuracy was 93.2% to 97.1%. GE scanner operations was the lowest category for every model (88.2% to 94.6%). In stem-only, frontier-model accuracy fell to 58.4% to 61.1%, and Llama 3.3 70B fell to 37.1%; GE scanner operations stem-only accuracy was 13.8% to 29.8%. Conclusion: High MCQ performance can mask weak free-text recall, especially for vendor-specific operational knowledge. MRI-Eval is most informative as a relative comparison benchmark rather than an absolute competency measure and supports caution in using raw LLM outputs for GE-specific protocol guidance.
This paper presents and validates CTseg, a freely available software for brain CT segmentation, spatial normalisation, and volumetrics. CTseg builds on the Multi-Brain generative modelling framework, providing a CT-specific pipeline that produces tissue maps, deformation fields, and brain volume estimates in the same format as SPM's unified segmentation, thereby extending SPM's established analysis chain from MRI to CT. CTseg is designed for routine hospital CT scans without requiring preprocessing or resampling in deployment. Although CTseg has been adopted in clinical research spanning, among other things, stroke, dementia, and brain morphometry, a systematic validation against an independent reference standard has been lacking. Using paired MR/CT head scans, we evaluate CTseg across four dimensions: segmentation accuracy against an MRI-derived silver standard; spatial normalisation consistency through group-average sharpness and voxelwise coefficient of variation; brain volume agreement via intraclass correlation and Bland-Altman analysis; and downstream sex classification performance from normalised tissue maps. As a baseline, we apply SPM's MRI-based unified segmentation directly to the CT images. CTseg significantly outperformed this baseline for segmentation and normalisation, showed stronger TBV agreement, and achieved comparable TIV agreement. CTseg is freely available at https://github.com/WCHN/CTseg, and all experiment code is included in the repository for full reproducibility.
This paper proposes dynamic stability and performance conditions for grid-connected inverter-based resources (IBRs). To this end, we extend the notion of steady-state droop coefficients to dynamic droop coefficients to capture the small-signal dynamics of IBRs and synchronous generators (SGs). Notably, the dynamic droop coefficients can be obtained from input-output data collected at the unit's (e.g., IBR or SG) point of interconnection without requiring prior knowledge of IBR internals or controls structure. To obtain frequency stability conditions, this IBR model is combined with a lightweight dynamic transmission network model that accounts for uncertainty of line dynamics. The resulting stability conditions are highly scalable and, given a few key network parameters, can be verified at the unit level. To make the conditions practical and offer intuitive and illustrative interpretations, we map the frequency stability conditions to bounds on the Bode plot of the dynamic droop coefficient for two broad types of IBR responses. Moreover, our specifications on the dynamic droop coefficient (i) translate basic frequency control ancillary services into verifiable requirements, and (ii) provide insights into the much-debated question of how to certify an IBR as grid-forming (GFM). The results are illustrated using dynamic droop coefficients obtained using detailed simulations of GFM and GFL IBRs as well as SGs.
The reliable operation of large-scale electric power networks is increasingly challenging, particularly with the integration of stochastic renewable generation. In this work, we address the problem of minimizing network transients by optimally modifying the underlying network. We formulate the problem in terms of graph Laplacian matrices and show that, under certain assumptions, the problem is convex. We derive a linear matrix inequality whose feasibility guarantees the existence and uniqueness of phase cohesive steady-state angles; this condition can be directly incorporated as a convex constraint in the optimization framework and we provide several geometric interpretations of the optimization problem. The proposed method is validated on the IEEE 30-bus test system, where results demonstrate that our approach effectively identifies critical links on the network. Dynamic simulations show a significant reduction in network transients and overall improvements across several performance metrics. We explore the sparsity-optimality trade-off using a reweighted $\ell_1$ heuristic.
We externally validated three deep learning models (DenseNet121, ViT-B/32, and ResNet50) for predicting mammographic breast density from breast ultrasound exams on an independent cohort. The external validation set comprised 2,000 ultrasound exams, including 500 cancer cases defined by an initial negative exam (BI-RADS 1 or 2) followed by a cancer diagnosis within 6 months to 10 years, and 1,500 negative controls matched by manufacturer and study year. Performance was measured using patient-level AUROC across four density categories: A (fatty), B (scattered), C (heterogeneous), and D (extremely dense). As a downstream assessment, we also evaluated 10-year risk prediction by incorporating age and AI-derived density into the Tyrer-Cuzick model and comparing performance against a reference model using age and mammography-reported density. All three models performed best in extremely dense breasts (AUROC 0.868-0.899), with strong performance in fatty (0.814-0.838) and scattered density (0.764-0.799), and lower performance in heterogeneously dense breasts (0.699-0.729). DenseNet121 achieved the highest overall performance (micro-averaged AUROC 0.885), and performance across categories was comparable between internal and external testing. For risk modeling, age combined with AI-derived density yielded a lower AUROC than age combined with mammography-reported density (0.541 vs. 0.570; p = 0.23), with no statistically significant difference. These findings indicate that deep learning models generalize well to external data with different racial composition for breast density assessment. While performance is strongest in extremely dense breasts, heterogeneously dense remains more challenging, highlighting the need for targeted optimization.
Gap-closing rate and visual looming swap discriminative dominance depending on deceleration intensity - a finding that reconciles a long-standing conflict in the car-following literature and challenges spacing-centered assumptions in traditional driver behavior models. This study presents a two-stage analytical framework that distinguishes between information availability (kinematic variables measurable in the environment) and information utilization (variables that demonstrably separate driver behavioral patterns), applied to 1,060,119 valid car-following observations from the NGSIM trajectory dataset (2,932 vehicles). Six kinematic features are extracted, and deceleration events are detected under two threshold conditions (-0.5 m/s^2 and -0.3 m/s^2). K-means clustering identifies behavioral modes, and one-way ANOVA with eta-squared effect sizes ranks each feature's discriminative power. Three key findings emerge: (1) threshold selection fundamentally shapes behavioral inference - the stricter threshold yields three interpretable modes while the permissive threshold collapses these to two; (2) hard braking prioritizes gap-closing rate (eta^2 = 0.715) while moderate braking emphasizes visual looming (eta^2 = 0.574); and (3) spacing headway is negligible (eta^2 <= 0.014) across both thresholds. These findings provide empirically grounded candidates for perceptual cue prioritization and have direct implications for ADAS warning system design and autonomous vehicle control.
Deploying large artificial intelligence (AI) models in power electronics often demands high computational resources. Driven by the quantization paradigm, this digest proposes a quantization-aware training (QAT) principle to substantially minimize the number of bits required and simultaneously maximize the accuracy of computations in pre-trained AI models. Considering a pre-trained probabilistic Bayesian Neural Network (BNN) for gear fault diagnosis in motor drives as an example, we quantize its weights and activation functions from floating-point FP32 to low-precision INT8 values, which enhances the computational efficiency by a significant margin of 30-45% (for different model versions) without any compromise in the accuracy and uncertainty estimates. This substantiates a sustainable mechanism of deploying most quantized light-weight AI models into low-cost edge processors for power electronic applications.
Artificial intelligence (AI)-driven fault diagnosis in motor drives often requires significant computational efforts and time for re-training, in addition to the limited knowledge behind the model and suitability of training and learning mechanisms. This work bridges this gap by proposing a structured mechanism of transforming untapped labeled fault data into AI parameters to leverage probabilistic data-driven learning. This novel AI reservoir modeling framework for power electronics not only eliminates exogenous efforts behind learning data patterns and its optimization, but also provides intuitive guidelines for power electronics engineers behind sizing of AI models. This alignment between data and system physics makes the proposed model transparent and interpretable, bridging practical understanding with data-driven learning. Its computational efficiency is demonstrated using experimental data that structured, physics-aware reservoirs achieve higher diagnostic accuracy and clearer explanations than conventional black-box AI methods.
Ambient Internet of Things (A-IoT) targets energy harvesting (EH), battery-less devices as a simple connectivity solution for extensive ultra-low-power deployments. These devices typically face intermittent energy availability, making uplink reports increasingly susceptible to access collisions and energy outages. In this paper, we build upon the cellular standardization of A-IoT and examine the paging-triggered contention-based random access (CBRA) framework for uplink reporting. We analyze the effects of energy availability and collisions on these systems and introduce an EH-aware access control mechanism. In this mechanism, the reader broadcasts an access probability in the paging message, which helps regulate the number of devices attempting random access. Results show that, unlike the baselines, the proposed method scales well under dense deployments by keeping collisions nearly constant, improving access efficiency, and substantially reducing the number of paging rounds required for successful reporting. These results highlight the importance of lightweight reader-side access control for reliable and resource-efficient reporting in A-IoT environments.
Human localization is gaining momentum in security, healthcare, logistics, and smart spaces applications. While global navigation systems are unreliable indoor, device-free (a.k.a. passive) localization methods that exploit human-induced perturbations of radio propagation can be effectively used. This paper investigates the use of a compact full-wave electromagnetic (EM) setup as a fast and reliable tool to simulate indoor Wi-Fi propagation for human sensing. The goal is to provide a practical baseline for validating simplified propagation models, such as diffraction-based descriptions, and to reduce the need for costly measurement campaigns. Two-dimensional attenuation maps from received signal strength are generated and compared in controlled environments, focusing on attenuation statistics and interference patterns. The simulations reproduce the main spatial features, though discrepancies remain due to simplified material characterization. Diffraction-aware refinements are proposed to mitigate these effects. Overall, the approach provides an efficient pre-measurement reference to support device-free system design and to guide experimental planning.
We demonstrate OESCL-band same-wavelength bi-directional transmission over 60 km HCF with 42.5 THz bandwidth, achieving GMIs comparable with the highest unidirectional SMF data-rates in both directions, with an aggregate of 423.7 + 426.5 Tb/s.
Multistatic collaborative sensing eliminates self-interference, achieves spatial diversity gains, and enables wide-range seamless integrated sensing and communication (ISAC). However, conventional data fusion methods suffer from severe error amplification in geometry-sensitive regions. In addition, the conventional analog phased array solution introduces large beam sweeping overhead, whereas the fully digital arrays request high hardware cost. We propose a multistatic sensing framework enabled by a phase-time array (PTA). The rainbow beamforming maps spatial directions to orthogonal frequency division multiplexing (OFDM) subcarriers, achieving wide-angle coverage with a single radio frequency (RF) chain. We develop two parameter-level schemes-a geometry-aware analytical estimator (GDOP-WLS) and a lightweight multilayer perceptron (PF-MLP)-to mitigate the effects of topological singularities. Additionally, an end-to-end signal-level convolutional neural network (SF-CNN) directly estimates target coordinates from raw signals, avoiding cascaded estimation errors. The results demonstrate that the parameter-level schemes ensure robust convergence under adverse geometric conditions with minimal computational latency. Conversely, the signal-level scheme achieves sub-meter precision but requires an increased computational load. Consequently, the proposed framework establishes a scalable solution for collaborative surveillance of unmanned aerial vehicles (UAVs), providing flexible trade-offs among hardware complexity, latency, and accuracy.
The day-ahead electricity market clearing with nonconvex order types can be formulated as a mixed-integer linear program (MILP), but its LP relaxation may provide weak bounds, and exact solutions can become computationally intractable in large-scale or extended market settings. We study a welfare-maximizing clearing model with elementary hourly orders, block orders with logical acceptance constraints, and flexible hourly orders. Starting from a compact MILP formulation, we derive an equivalent completely positive programming (CPP) reformulation via matrix lifting and propose relaxed CPP variants that further reduce the modeling burden while maintaining strong bounds. We then develop tractable doubly nonnegative (DNN) relaxations, including decomposed formulations that exploit the problem structure by using smaller positive semidefinite matrices. To further strengthen these bounds, we introduce reformulation-linearization technique (RLT) inequalities tailored to the decomposed structure. To tackle the challenge of large-scale DNNs, we design an alternating direction method of multipliers (ADMM) with adaptive penalty updates and rigorous dual lower bounds, enabling certified early termination. Computational experiments on synthetic instances show that the proposed DNN+RLT relaxations substantially tighten LP bounds, while decomposition and first-order methods significantly reduce computational effort.
Passivity indices have been widely adopted to derive distributed stability certificates for power systems. Nevertheless, conventional passivity indices remain scalar-valued even for multi-input-multi-output (MIMO) systems, which can introduce excessive conservatism and compromise analysis accuracy. To overcome these limitations, this paper extends the differential passivity index to a matrix-valued formulation that captures both channel-wise passivity properties and inter-channel coupling effects in MIMO subsystems. On this basis, semi-distributed and fully distributed stability criteria are developed for power systems with heterogeneous nonlinear devices. It is shown that system stability is guaranteed when the aggregate passivity excess of devices compensates for the passivity shortage imposed by the network. Furthermore, analytical passivity matrix expressions for typical power system components are derived, facilitating compositional stability analysis. Case studies on a three-bus system and a modified IEEE 118-bus system validate the effectiveness of the proposed framework.
The recent rapid proliferation of renewable energy is fundamentally changing the dynamic operations of power systems, necessitating new approaches to assess stability for these highly nonlinear systems. In this paper, we prove that synchronous machine systems, modeled in the nonlinear dq-frame, possess fundamental dissipativity properties. Specifically, we show passivity from current input to voltage output and a nonlinear negative imaginary property from torque input to rotor angle output. For the nonlinear system shifted around an equilibrium point, we derive explicit conditions for both passivity and the NI property to hold. Finally, we demonstrate that interconnection with passive droop controllers preserves these dissipativity properties with identical supply rates, thereby ensuring closed-loop stability.
This paper investigates equilibrium points and stability in two synchronous machine configurations: (i) a single generator with an impedance load and (ii) two interconnected machines with co-located loads. We consider both abc and dq reference frames to show that the equilibrium condition reduces to a cubic polynomial in the single-machine case and to an 18th- degree polynomial in the two-machine case. For the single-machine system, Lyapunov stability analysis and linearization based stability analysis are carried out. For the two-machine system, local stability is assessed through linearization and eigenvalue analysis. Illustrative examples confirm the existence of multiple equilibria and illustrate the impact of parameter variation on stability. Our results provide insight into the stability of synchronous machine systems.
This paper studies uplink multiuser MIMO with a rotatable antenna (RA) array under imperfect channel state information (CSI), where each base-station antenna can adjust its boresight direction within an angular region. To balance performance and control overhead, we propose a two-timescale design: RA orientations are optimized from statistical CSI on a large timescale, while linear receive combiners are updated per coherence block from linear minimum-mean-squared-error (LMMSE) channel estimates. Under this framework, we derive a closed-form use-and-then-forget (UatF)-based rate expression for maximum-ratio combining (MRC) and a closed-form statistical rate surrogate for weighted zero-forcing (wZF) under imperfect CSI, revealing how RA rotation influences useful signal strength, estimation-error-induced self-interference, and multiuser interference. The analysis shows that the orientation minimizing channel-estimation error differs from the rate-maximizing one, and that MRC and wZF prefer different rotation configurations due to their distinct mechanisms of signal aggregation and error-aware user separation. For the resulting non-convex rotation design problems, we develop a projected-gradient algorithm over a product of spherical caps with explicit derivatives of the required channel statistics and rate metrics. Numerical results verify the accuracy of the large-timescale surrogates and show substantial performance gains from RA optimization.
Solutions to pursuit-evasion and surveillance-evasion differential games are typically computed and expressed using open-loop representations, with the synthesis of feedback strategies significantly less common. We propose a numerical scheme for obtaining feedback strategies for the recently introduced prying-pedestrian surveillance-evasion differential game. The scheme involves computing feedback strategies as input-output maps approximated via neural networks trained using data obtained from open-loop representations of solutions. Simulations show the effectiveness of neural networks trained with an appropriate learning-loss function. Since optimal feedback strategies are discontinuous, as a second contribution, the potential loss/gain of individual players is subsequently studied for players using sample-and-hold feedback compared to continuous-time feedback.
Resilience is becoming crucial for future wireless networks, which must withstand, adapt to, and recover from rare but potentially cascading disruptions. This paper develops a sequential Monte Carlo (SMC) simulation framework for such systems, in which resilience failures are formulated as path-dependent rare events arising from staged degradation and delayed recovery, and are decomposed into semantically interpretable levels defined by a reaction coordinate. Building on this structure, we present a fixed-level splitting approach with budget-aware population control, enabling efficient estimation of rare non-recovery probabilities. We discuss the potential reuse of SMC checkpoints as representative near-critical states for policy evaluation and simulation-based selection. We further extend the methodology to learned stochastic simulation by using generative sequence models as restartable surrogates within data-driven digital twins. We showcase the framework in a delay-critical wireless network use case, where SMC substantially improves over standard Monte Carlo in rare-event regimes with both physical and learned simulators.
While the spatial directivity of multichannel speech enhancement algorithms improves with the number of microphones, fitting large capture arrays into real-world edge devices is typically limited by physical constraints. To overcome this limitation, we propose Spatial-Magnifier, a neural network designed to generate virtual microphone (VM) signals from a limited set of real microphone (RM) measurements. Moreover, we introduce the Spatial Audio Representation Learning (SARL) framework, which leverages estimated VM signals and features to condition a downstream speech enhancement system. Experimental results demonstrate that the proposed framework outperforms existing spatial upsampling baselines across various speech extraction systems, including end-to-end multichannel speech enhancement and neural beamforming. The proposed method nearly recovers the oracle performance achieved when all microphones are available.
Specific Emitter Identification (SEI) provides physical-layer device authentication for wireless communications and Internet of Things (IoT) systems. While deep learning (DL) has significantly advanced SEI performance, label noise severely degrades system reliability in non-cooperative environments. Label noise originates from channel-induced ambiguities, annotation errors, and deliberate data poisoning by intelligent jammers injecting misleading signals. While recent SEI methods attempt to mitigate label noise, they fundamentally rely on corrupted supervised signals to guide sample selection, inevitably leading to confirmation bias and suboptimal feature spaces. To address this challenge, we propose SEI-SHIELD, a robust SEI framework that integrates self-supervised contrastive pre-training with iterative sample selection. Specifically, SEI-SHIELD employs Momentum Contrast (MoCo) with RF-tailored augmentations to extract intrinsically robust, label-independent representations directly from complex-valued I/Q signals. In addition, K-nearest neighbors (KNN)-based noise filtering identifies corrupted samples through neighborhood label consistency analysis in the learned feature space. Furthermore, an iterative rescue mechanism using prediction confidence and prototype cosine similarity progressively recovers correctly labeled hard samples inadvertently discarded during filtering. Comprehensive experiments on the POWDER and ORACLE datasets demonstrate that SEI-SHIELD achieves state-of-the-art (SOTA) accuracy under various noise rates, substantially outperforming existing noise-robust paradigms, including advanced regularization techniques and sample selection frameworks.
We study channel parameter estimation for multiuser orthogonal time frequency space (OTFS) systems in the delay-Doppler (DD) domain. To enable structured parametric estimation, we adopt a multi-user pilot cyclic prefix (MU-PCP) design, which multiplexes users along the Doppler dimension while preserving a separable exponential structure. This structure facilitates high-resolution estimation of fractional delay and Doppler parameters in the multiuser setting. Building on this framework, we extend weighted MUSIC (W-MUSIC) to multiuser OTFS, providing a computationally efficient approach with mild grid dependency, and develop a matrix pencil (MP)-based method that achieves fully grid-independent delay-Doppler parameter estimation. Numerical results demonstrate the effectiveness of the proposed methods and reveal a robustness-complexity tradeoff: W-MUSIC performs better at low SNR, while MP achieves higher estimation accuracy at moderate-to-high SNR with significantly lower computational complexity.
In this paper, we study second-order lag consensus in multi-agent cyber-physical networks subject to random noise and input failures, within a framework modeling the interactions and perceptions between physical twins and digital twins. We propose a lag consensus protocol and establish sufficient conditions for the mean-square (exponential) stability of the resulting stochastic lag error dynamics. The consensus criteria are derived via Lyapunov analysis using the Itô formula, ensuring robustness to random perturbations and intermittent input failures. Numerical examples illustrate the effectiveness of the proposed method.
Understanding the wireless spectrum is a fundamen- tal requirement for intelligent communication systems, however, interpreting spectrograms requires extracting multiple physical attributes and reasoning about signal structure, which is a capability that is not achieved by traditional ML approaches. Recent advances in vision-language models (VLMs) demonstrated the possibility of learning such interpretation capabilities directly from data. This paper investigates whether VLMs can learn this capability from synthetic data alone, and more importantly, whether such learned representations generalize to real over-the- air RF environments. To address this question, we introduce RF-Analyzer, an SDR-to-AI analysis platform that integrates live spectrum captures associated with the corresponding VLM- based interpretation, enabling direct evaluation of VLMs outputs on live over-the-air signals. Using this platform, we assess a model trained exclusively on synthetic spectrogram data with general-purpose baselines. To enable systematic analysis, we establish a benchmark framework comprising three metrics, Physical Attribute Extraction Score (PAES), Prompt Leakage Rate (PLR), and hallucination count, to assess signal understanding and grounding. The obtained results demonstrate that VLMs trained on synthetic spectrogram data can generalize to real RF environments, particularly for extracting physical signal attributes such as spectral occupancy, temporal behavior, and SNR. This indicates that synthetic data is sufficient for learning transferable representations of RF signal structure. However, this generalization is limited due to the fact that synthetic training does not provide reliable semantic grounding without contextual priors. In particular, generalization breaks under conditions that are not covered in the synthetic distribution, particularly low-SNR regimes
This paper addresses the trajectory-tracking problem for discrete-time linear time-invariant systems with bounded parametric uncertainty, subject to hard constraints on system states, control inputs, and input rates. Unlike existing methods, which often consider only partial uncertainty, omit input-rate or state constraints, or focus on regulation problems, this work provides a systematic adaptive model predictive control (MPC) solution for constrained trajectory tracking under full parametric uncertainty. Determining the control input required to achieve zero tracking error under unknown parameters is challenging. Simultaneously, trajectory tracking under uncertainty with input-rate constraints induces temporal coupling in the control sequence, resulting in a time-varying admissible control set and rendering standard recursive feasibility arguments inapplicable. These challenges are overcome by systematically utilizing the estimated system parameters, coupled with a suitably designed adaptive learning process within a reformulated MPC framework. The recursive feasibility of the proposed MPC optimization routine is then rigorously established despite the time-varying admissible control set induced by input-rate constraints. Closed-loop stability is guaranteed via Lyapunov-based analysis, ensuring convergence of the tracking error and boundedness of system states. Simulation results validate the effectiveness of the pr