Data Analysis, Statistics and Probability

2025-04-18 | Total: 4

#1 Bayesian model-data comparison incorporating theoretical uncertainties

Authors: Sunil Jaiswal, Chun Shen, Richard J. Furnstahl, Ulrich Heinz, Matthew T. Pratola

Accurate comparisons between theoretical models and experimental data are critical for scientific progress. However, inferred model parameters can vary significantly with the chosen physics model, highlighting the importance of properly accounting for theoretical uncertainties. In this article, we explicitly incorporate these uncertainties using Gaussian processes that model the domain of validity of theoretical models, integrating prior knowledge about where a theory applies and where it does not. We demonstrate the effectiveness of this approach using two systems: a simple ball drop experiment and multi-stage heavy-ion simulations. In both cases, incorporating model discrepancy leads to improved parameter estimates, with systematic improvements observed as additional experimental observables are integrated.

Subjects: High Energy Physics - Phenomenology, Nuclear Theory, Data Analysis, Statistics and Probability

Publish: 2025-04-17 17:53:39 UTC
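
As a rough sketch of the idea in the abstract above, a model-discrepancy setup is often written in the Kennedy-O'Hagan form below. The symbols ($y_i$, $\eta$, $\delta$, $k$, $\Sigma$) are assumptions chosen for illustration and are not taken from the paper, which may use a different formulation.

```latex
\begin{align}
  y_i &= \eta(x_i, \theta) + \delta(x_i) + \epsilon_i,
      \qquad \epsilon_i \sim \mathcal{N}(0, \sigma_i^2), \\
  \delta(\cdot) &\sim \mathcal{GP}\bigl(0,\; k(x, x')\bigr), \\
  \mathbf{y} \mid \theta &\sim
      \mathcal{N}\bigl(\boldsymbol{\eta}(\theta),\; K + \Sigma\bigr),
      \qquad K_{ij} = k(x_i, x_j), \quad \Sigma = \operatorname{diag}(\sigma_i^2).
\end{align}
```

Here $\eta$ is the physics model with parameters $\theta$, $\delta$ is the discrepancy between model and reality, and the last line follows from marginalizing over $\delta$. Under this reading, prior knowledge about where a theory applies could enter through a kernel $k$ whose variance is small where the model is trusted and larger where it is not.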


#2 Compared analysis of DInSAR data from ascending and descending orbits of Sentinel-1: the Cazzaso case study

Authors: Giuseppe Buono, Raffaele Nutricato, Paolo Facchi, Luciano Guerriero, Francesco Vincenzo Pepe, Cosmo Lupo, Saverio Pascazio

Differential SAR interferometry (DInSAR), by providing displacement time series over coherent objects on the Earth's surface (persistent scatterers), makes it possible to analyze wide areas, identify ground displacements, and study their evolution over long times. In this work we implement an innovative approach that relies exclusively on line-of-sight displacement time series and is applicable to cases of correlated persistent-scatterer displacements. We identify the locus of the final positions of the persistent scatterers and automatically calculate the lower bound of the magnitude of the potential three-dimensional displacements. We present the results obtained by using Sentinel-1 data to investigate the ground stability of the hilly village of Cazzaso, located in the Italian Alps (Friuli Venezia Giulia region) in an area affected by an active landslide. SAR datasets acquired by Sentinel-1 from both ascending and descending orbits were processed using the SPINUA algorithm. Displacement time series were analysed in order to resolve phase-unwrapping issues and to calculate the displacement field.

Subjects: Geophysics, Earth and Planetary Astrophysics, Data Analysis, Statistics and Probability

Publish: 2025-04-02 14:32:41 UTC
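
The lower bound mentioned in the abstract above can be illustrated with simple geometry. Two line-of-sight (LOS) readings constrain a three-dimensional displacement to a line, and the point on that line closest to the origin gives the smallest possible magnitude. The sketch below is not the SPINUA algorithm or the authors' procedure; the function names, LOS vectors, and readings are hypothetical values chosen for illustration.

```python
import math

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def norm(p):
    return math.sqrt(dot(p, p))

def unit(p):
    n = norm(p)
    return tuple(a / n for a in p)

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def min_norm_displacement(u_asc, u_desc, d_asc, d_desc):
    """Smallest 3D displacement consistent with two LOS readings.

    u_asc, u_desc: LOS vectors (east, north, up) for the two orbits.
    d_asc, d_desc: measured LOS displacements (same length unit).
    Solves A d = b with rows A = [u_asc; u_desc] via d = A^T (A A^T)^{-1} b.
    """
    # Entries of the 2x2 Gram matrix A A^T.
    g11, g12, g22 = dot(u_asc, u_asc), dot(u_asc, u_desc), dot(u_desc, u_desc)
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("LOS vectors are (nearly) parallel")
    # Coefficients c = (A A^T)^{-1} b.
    c1 = (g22 * d_asc - g12 * d_desc) / det
    c2 = (g11 * d_desc - g12 * d_asc) / det
    return tuple(c1 * a + c2 * b for a, b in zip(u_asc, u_desc))

# Hypothetical LOS unit vectors in (east, north, up) and LOS readings in mm.
u_asc = unit((-0.62, -0.11, 0.78))
u_desc = unit((0.62, -0.11, 0.78))
d_min = min_norm_displacement(u_asc, u_desc, -4.0, 3.0)

# Direction along which the two LOS readings carry no information.
free_dir = unit(cross(u_asc, u_desc))
print("lower bound on |d| (mm):", norm(d_min))
```

Every displacement of the form `d_min + t * free_dir` reproduces both readings, and because `d_min` is orthogonal to `free_dir`, its norm is the minimum over that line.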


#3 Transforming Simulation to Data Without Pairing

Authors: Eli Gendreau-Distler, Luc Le Pottier, Haichen Wang

We explore a generative machine learning-based approach for estimating multi-dimensional probability density functions (PDFs) in a target sample using a statistically independent but related control sample, a common challenge in particle physics data analysis. The generative model must accurately reproduce individual observable distributions while preserving the correlations between them, based on the input multidimensional distribution from the control sample. Here we present a conditional normalizing flow (CNF) model based on a chain of bijectors, which learns to transform unpaired simulation events to data events. We assess the performance of the CNF model in the context of an LHC Higgs-to-diphoton analysis, where we use the CNF model to convert a Monte Carlo diphoton sample to one that models data. We show that the CNF model can accurately model complex data distributions and correlations. We also leverage the recently popularized Modified Differential Multiplier Method (MDMM) to improve the convergence of our model and assign physical meaning to usually arbitrary loss-function parameters.

Subjects: Data Analysis, Statistics and Probability, High Energy Physics - Experiment, High Energy Physics - Phenomenology

Publish: 2025-04-15 08:12:54 UTC
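
To make the phrase "chain of bijectors" in the abstract above concrete, the toy sketch below pushes a standard normal through two fixed one-dimensional affine maps and evaluates the resulting density with the change-of-variables formula. It is only an illustration: the class and function names are hypothetical, and the CNF in the paper would use learned, conditional, multi-dimensional bijectors rather than these fixed ones.

```python
import math

class Affine:
    """Bijector y = scale * x + shift, with scale != 0."""

    def __init__(self, scale, shift):
        if scale == 0:
            raise ValueError("scale must be non-zero for the map to be invertible")
        self.scale = scale
        self.shift = shift

    def forward(self, x):
        return self.scale * x + self.shift

    def inverse(self, y):
        return (y - self.shift) / self.scale

    def log_abs_det_inverse(self, y):
        # d(inverse)/dy = 1 / scale, independent of y for an affine map.
        return -math.log(abs(self.scale))

def base_log_prob(z):
    """Log density of a standard normal base distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def chain_log_prob(bijectors, y):
    """Log density of y when base samples are pushed through `bijectors` in order."""
    log_det = 0.0
    # Undo the chain from the last bijector to the first, accumulating Jacobians.
    for b in reversed(bijectors):
        log_det += b.log_abs_det_inverse(y)
        y = b.inverse(y)
    return base_log_prob(y) + log_det

# x -> 2x + 1 -> 3(2x + 1) - 1 = 6x + 2, so the output is normal with mean 2, std 6.
chain = [Affine(2.0, 1.0), Affine(3.0, -1.0)]
print(chain_log_prob(chain, 2.0))
```

Because the log-determinants simply add along the chain, more expressive bijectors can be stacked in the same way without changing `chain_log_prob`.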


#4 Maximum Information Extraction From Noisy Data Via Shannon Entropy Minimization

Authors: Matteo Becchi, Giovanni Maria Pavan

Ensuring maximum information extraction in the analysis of noisy data is non-trivial. We introduce a general, data-driven approach that employs Shannon entropy as a transferable metric to quantify the maximum information extractable from noisy data via their clustering into statistically relevant micro-domains. We demonstrate the method's efficiency by analyzing, as a representative example, time-series data extracted from molecular dynamics simulations of water and ice coexisting at the solid/liquid transition temperature. The method allows quantifying the information contained in the data distributions (time-independent component) and the additional information gain attainable by analyzing data as time series (i.e., accounting for the information contained in data time-correlations). The approach is also highly effective for high-dimensional datasets, providing clear demonstrations of how including components/data that are only weakly informative but noisy can be not only useless but even detrimental to maximum information extraction. This provides a general and robust parameter-free approach and quantitative metrics for data analysis, and for the study of any type of system from its data.

Subject: Data Analysis, Statistics and Probability

Publish: 2025-04-17 14:54:46 UTC
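
As a minimal illustration of the metric named in the abstract above, the sketch below computes the Shannon entropy of a set of cluster labels. It does not reproduce the authors' clustering procedure; the function name and the example labels are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`."""
    n = len(labels)
    if n == 0:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # H = -sum_i p_i * log2(p_i), with p_i the fraction of points in cluster i.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical cluster assignments for eight data points.
balanced = ["ice", "ice", "ice", "ice", "water", "water", "water", "water"]
single = ["ice"] * 8
print(shannon_entropy(balanced), shannon_entropy(single))
```

An assignment that places every point in one cluster has zero entropy, while an even split between two clusters has one bit; comparing such values across clusterings is one simple way to put them on a common scale.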