Data Analysis, Statistics and Probability

2025-05-05 | Total: 4

#1 Higher-Order Spectra and their Unbiased Estimation in the GPU-accelerated SignalSnap Library

Authors: Markus Sifft, Armin Ghorbanietemad, Fabian Wagner, Daniel Hägele

The analysis of time-dependent data poses a fundamental challenge in many fields of science and engineering. While concepts for higher-order spectral analysis, such as Brillinger's polyspectra for stationary processes, were introduced long ago, their applications have been limited, probably because of the high computational cost and the complexity of implementation. Here we discuss the theoretical background of estimating polyspectra with our open-source GPU-accelerated SignalSnap library and highlight its advantages over previous implementations: (i) For the first time, the calculation of spectra is based on unbiased and consistent estimators that suppress the appearance of false structures in fourth-order spectra. (ii) SignalSnap implements cross-correlation spectra for up to four channels. (iii) The spectral estimates of SignalSnap have a clear relation to Brillinger's definition of ideal spectra of continuous stochastic processes in terms of amplitude and spectral resolution. (iv) SignalSnap estimates the variance of each spectral value. We show how polyspectra reveal, e.g., the correlations between different channels or the breaking of time-inversion symmetry and discuss how quasi-polyspectra uncover the non-stationarity of signals.

Subjects: Data Analysis, Statistics and Probability; Signal Processing; Quantum Physics

Publish: 2025-05-02 12:36:30 UTC


#2 Family-Vicsek universality of the binary intrinsic dimension of nonequilibrium data

Authors: Roberto Verdel, Devendra Singh Bhakuni, Santiago Acevedo

The intrinsic dimension (ID) is a powerful tool to detect and quantify correlations in data. Recently, it has been successfully applied to study statistical and many-body systems in equilibrium. Yet, its application to systems away from equilibrium remains largely unexplored. Here we study the ID of nonequilibrium growth dynamics data, and show that even after reducing these data to binary form, their binary intrinsic dimension (BID) retains essential physical information. Specifically, we find that, akin to the surface width, it exhibits Family-Vicsek dynamical scaling -- a fundamental feature for describing universality in surface roughness phenomena. These findings highlight the ability of the BID to correctly discern key properties and correlations in nonequilibrium data, and open an avenue for alternative characterizations of out-of-equilibrium dynamics.
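
For reference, the Family-Vicsek scaling mentioned in the abstract is usually written for the surface width W of a system of linear size L as shown below. This is the standard textbook form, with roughness exponent \alpha, growth exponent \beta and dynamic exponent z; the abstract does not state the exponents found for the BID, so none are assumed here.

```latex
W(L,t) \sim L^{\alpha}\, f\!\left(\frac{t}{L^{z}}\right),
\qquad
f(u) \sim
\begin{cases}
u^{\beta}, & u \ll 1,\\
\text{const}, & u \gg 1,
\end{cases}
\qquad
z = \frac{\alpha}{\beta}.
```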

Subjects: Statistical Mechanics; Computational Physics; Data Analysis, Statistics and Probability

Publish: 2025-05-02 08:58:08 UTC


#3 Low-dimensional representation of brain networks for seizure risk forecasting

Authors: Steven Rico-Aparicio, Martin Guillemaud, Alice Longhena, Vincent Navarro, Louis Cousyn, Mario Chavez

Identifying preictal states -- periods during which seizures are more likely to occur -- remains a central challenge in clinical computational neuroscience. In this study, we introduce a novel framework that embeds functional brain connectivity networks, derived from intracranial EEG (iEEG) recordings, into a low-dimensional Euclidean space. This compact representation captures essential topological features of brain dynamics and facilitates the detection of subtle connectivity changes preceding seizures. Using standard machine learning techniques, we define a dimensionless biomarker, B, that discriminates between interictal (seizure-free) and preictal (within 24 hours of seizure) network states. Our method focuses on connectivity patterns among a subset of informative iEEG electrodes, enabling robust classification of brain states across time. We validate our approach using a leave-one-out cross-validation scheme and a pseudo-prospective forecasting strategy, assessing performance with metrics such as F1-score and balanced accuracy. Results show that low-dimensional Euclidean embeddings of iEEG connectivity yield interpretable and predictive markers of preictal activity, offering promising implications for real-time seizure forecasting and individualized therapeutic interventions.

Subjects: Neurons and Cognition; Data Analysis, Statistics and Probability

Publish: 2025-05-01 20:41:50 UTC


#4 On the emergence of numerical instabilities in Next Generation Reservoir Computing

Authors: Edmilson Roque dos Santos, Erik Bollt

Next Generation Reservoir Computing (NGRC) is a low-cost machine learning method for forecasting chaotic time series from data. However, ensuring the dynamical stability of NGRC models during autonomous prediction remains a challenge. In this work, we uncover a key connection between the numerical conditioning of the NGRC feature matrix -- formed by polynomial evaluations on time-delay coordinates -- and the long-term NGRC dynamics. Merging tools from numerical linear algebra and ergodic theory of dynamical systems, we systematically study how the feature matrix conditioning varies across hyperparameters. We demonstrate that the NGRC feature matrix tends to be ill-conditioned for short time lags and high-degree polynomials. Ill-conditioning amplifies sensitivity to training data perturbations, which can produce unstable NGRC dynamics. We evaluate the impact of different numerical algorithms (Cholesky, SVD, and LU) for solving the regularized least-squares problem.
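
The link between time lag, polynomial degree and conditioning described in the abstract can be probed with a small numerical experiment. The following is a minimal sketch, not the authors' code: the function names `lorenz_x` and `ngrc_features`, the choice of the Lorenz-63 x-component as test signal, the step size 0.01, and the lags and degrees compared are all assumptions made for illustration. It builds an unregularized feature matrix from monomials of time-delay coordinates and reports its 2-norm condition number.

```python
import itertools

import numpy as np


def lorenz_x(n_steps, dt=0.01, discard=1000):
    """Return the x-component of the Lorenz-63 system (classical RK4 steps).

    The first `discard` steps are dropped as a transient. Hypothetical helper
    used only to generate a chaotic test signal.
    """
    def rhs(s):
        x, y, z = s
        return np.array([10.0 * (y - x),
                         x * (28.0 - z) - y,
                         x * y - (8.0 / 3.0) * z])

    state = np.array([1.0, 1.0, 1.0])
    xs = np.empty(n_steps + discard)
    for i in range(n_steps + discard):
        k1 = rhs(state)
        k2 = rhs(state + 0.5 * dt * k1)
        k3 = rhs(state + 0.5 * dt * k2)
        k4 = rhs(state + dt * k3)
        state = state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        xs[i] = state[0]
    return xs[discard:]


def ngrc_features(series, n_delays, lag, degree):
    """Feature matrix of all monomials up to `degree` in the delay coordinates
    (x_t, x_{t-lag}, ..., x_{t-(n_delays-1)*lag}), including a constant column.
    """
    start = (n_delays - 1) * lag
    n_rows = len(series) - start
    # Column j holds the series shifted back by j * lag samples.
    delays = np.column_stack(
        [series[start - j * lag: start - j * lag + n_rows]
         for j in range(n_delays)]
    )
    columns = [np.ones(n_rows)]
    for d in range(1, degree + 1):
        for idx in itertools.combinations_with_replacement(range(n_delays), d):
            columns.append(np.prod(delays[:, list(idx)], axis=1))
    return np.column_stack(columns)


series = lorenz_x(5000)
for lag, degree in [(1, 2), (20, 2), (1, 3)]:
    phi = ngrc_features(series, n_delays=2, lag=lag, degree=degree)
    cond = np.linalg.cond(phi)  # ratio of largest to smallest singular value
    print(f"lag={lag:2d} degree={degree} columns={phi.shape[1]} cond={cond:.3e}")
```

For this test signal one would expect the condition number to be larger for the short lag than for the long one, and to grow again when the degree is raised, in line with the trend stated in the abstract. The actual values depend on the signal, the sampling step and the column scaling, so they should not be read as results from the paper.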

Subjects: Machine Learning; Dynamical Systems; Data Analysis, Statistics and Probability

Publish: 2025-05-01 20:16:44 UTC