Data Analysis, Statistics and Probability

#1 Approach to predicting extreme events in time series of chaotic dynamical systems using machine learning techniques

Authors: Alexandre C. Andreani, Bruno R. R. Boaretto, Elbert E. N. Macau

This work proposes a machine learning approach to predicting extreme events in time series of chaotic dynamical systems. The research focuses on time series of the Hénon map, a two-dimensional model known for its chaotic behavior. The method identifies time windows that anticipate extreme events and uses convolutional neural networks to classify the system states. By reconstructing attractors and classifying regimes as normal or transitional, the model shows high accuracy in predicting normal regimes, although forecasting transitional regimes remains challenging, particularly for longer intervals and rarer events. The method achieves a success rate above 80% in predicting the transitional regime up to 3 steps before the extreme event occurs. Despite limitations posed by the chaotic nature of the system, the approach opens avenues for further exploration of alternative neural network architectures and broader datasets to enhance forecasting capabilities.

Subjects: Chaotic Dynamics; Data Analysis, Statistics and Probability

Publish: 2025-07-10 15:07:06 UTC
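
The windowing idea in entry #1 can be illustrated with a minimal sketch. It assumes the classical Hénon parameters (a = 1.4, b = 0.3), defines an extreme event as an x value above a high quantile of the series, and labels a window "transitional" (1) when an extreme event falls within the next few steps. The threshold rule, the window length, the horizon, and the function names are all assumptions made for illustration; the abstract does not specify them, and no neural network is trained here.

```python
def henon_series(n, a=1.4, b=0.3, x0=0.0, y0=0.0, burn_in=100):
    """Return n values of the x coordinate of the Henon map after a burn-in."""
    x, y = x0, y0
    xs = []
    for i in range(n + burn_in):
        # Henon map: x_{n+1} = 1 - a*x_n^2 + y_n,  y_{n+1} = b*x_n
        x, y = 1.0 - a * x * x + y, b * x
        if i >= burn_in:
            xs.append(x)
    return xs


def label_windows(xs, window=8, horizon=3, quantile=0.95):
    """Pair each sliding window with a label: 1 if an extreme event
    (value above the chosen quantile, an assumed definition) occurs
    within the next `horizon` steps, else 0."""
    threshold = sorted(xs)[int(quantile * (len(xs) - 1))]
    samples = []
    for start in range(len(xs) - window - horizon + 1):
        end = start + window
        future = xs[end:end + horizon]
        label = 1 if any(v > threshold for v in future) else 0
        samples.append((xs[start:end], label))
    return threshold, samples


xs = henon_series(2000)
threshold, samples = label_windows(xs, window=8, horizon=3)
n_transitional = sum(label for _, label in samples)
print(f"threshold={threshold:.3f}, windows={len(samples)}, transitional={n_transitional}")
```

The resulting (window, label) pairs are the kind of input a classifier such as a convolutional network could be trained on; the imbalance between the two classes hints at why the rarer transitional regime is harder to forecast.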

#2 Extracting ORR Catalyst Information for Fuel Cell from Scientific Literature

Authors: Hein Htet, Amgad Ahmed Ali Ibrahim, Yutaka Sasaki, Ryoji Asahi

The oxygen reduction reaction (ORR) catalyst plays a critical role in enhancing fuel cell efficiency, making it a key focus in material science research. However, extracting structured information about ORR catalysts from vast scientific literature remains a significant challenge due to the complexity and diversity of textual data. In this study, we propose a named entity recognition (NER) and relation extraction (RE) approach using DyGIE++ with multiple pre-trained BERT variants, including MatSciBERT and PubMedBERT, to extract ORR catalyst-related information from the scientific literature, which is compiled into a fuel cell corpus for materials informatics (FC-CoMIcs). A comprehensive dataset was constructed manually by identifying 12 critical entities and two relationship types between pairs of the entities. Our methodology involves data annotation, integration, and fine-tuning of transformer-based models to enhance information extraction accuracy. We assess the impact of different BERT variants on extraction performance and investigate the effects of annotation consistency. Experimental evaluations demonstrate that the fine-tuned PubMedBERT model achieves the highest NER F1-score of 82.19% and the MatSciBERT model attains the best RE F1-score of 66.10%. Furthermore, the comparison with human annotators highlights the reliability of fine-tuned models for ORR catalyst extraction, demonstrating their potential for scalable and automated literature analysis. The results indicate that domain-specific BERT models outperform general scientific models like BlueBERT for ORR catalyst extraction.

Subjects: Computation and Language; Data Analysis, Statistics and Probability

Publish: 2025-07-10 07:35:12 UTC
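
Entry #2 reports its results as F1-scores. As a rough sketch of how such a score is commonly computed for entity extraction, the function below compares predicted and gold entity spans by exact match and returns the harmonic mean of precision and recall. The exact-match criterion, the span tuples, and the entity type names are hypothetical; the abstract does not list the 12 entity types or state the matching rule used.

```python
def span_f1(gold, predicted):
    """Exact-match F1 over sets of (start, end, entity_type) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for one sentence (type names are placeholders).
gold = [(0, 2, "CATALYST"), (5, 6, "VALUE"), (8, 9, "MATERIAL")]
predicted = [(0, 2, "CATALYST"), (5, 6, "MATERIAL")]

score = span_f1(gold, predicted)
print(f"F1 = {score:.2f}")
```

Here one of two predictions is correct (precision 1/2) and one of three gold spans is recovered (recall 1/3), so a span with the right boundaries but the wrong type counts as an error under this matching rule.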

#3 Towards Robust Surrogate Models: Benchmarking Machine Learning Approaches to Expediting Phase Field Simulations of Brittle Fracture

Authors: Erfan Hamdi, Emma Lejeune

Data-driven approaches have the potential to make modeling complex, nonlinear physical phenomena significantly more computationally tractable. For example, computational modeling of fracture is a core challenge where machine learning techniques have the potential to provide a much-needed speedup that would enable progress in areas such as multi-scale modeling and uncertainty quantification. Currently, phase field modeling (PFM) of fracture is one such approach that offers a convenient variational formulation to model crack nucleation, branching and propagation. To date, machine learning techniques have shown promise in approximating PFM simulations. However, most studies rely on overly simple benchmarks that do not reflect the true complexity of the fracture processes where PFM excels as a method. To address this gap, we introduce a challenging dataset based on PFM simulations designed to benchmark and advance ML methods for fracture modeling. This dataset includes three energy decomposition methods, two boundary conditions, and 1,000 random initial crack configurations for a total of 6,000 simulations. Each sample contains 100 time steps capturing the temporal evolution of the crack field. Alongside this dataset, we also implement and evaluate Physics-Informed Neural Networks (PINN), Fourier Neural Operators (FNO) and UNet models as baselines, and explore the impact of ensembling strategies on prediction accuracy. With this combination of our dataset and baseline models drawn from the literature, we aim to provide a standardized and challenging benchmark for evaluating machine learning approaches to solid mechanics. Our results highlight both the promise and limitations of popular current models, and demonstrate the utility of this dataset as a testbed for advancing machine learning in fracture mechanics research.

Subjects: Machine Learning; Data Analysis, Statistics and Probability

Publish: 2025-07-09 19:14:56 UTC
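
The dataset size quoted in entry #3 follows from a full cross of its three factors, which the short sketch below enumerates. It assumes every crack configuration is simulated under every combination of decomposition method and boundary condition, which is the reading consistent with the stated total of 6,000; the factor names are placeholders, since the abstract does not name them.

```python
from itertools import product

# Placeholder names: the abstract gives only the counts, not the labels.
decompositions = ["decomposition_A", "decomposition_B", "decomposition_C"]
boundary_conditions = ["bc_1", "bc_2"]
n_cracks = 1000      # random initial crack configurations
time_steps = 100     # time steps stored per simulation

# One simulation per (decomposition, boundary condition, crack) combination.
simulations = list(product(decompositions, boundary_conditions, range(n_cracks)))
total_frames = len(simulations) * time_steps

print(len(simulations), total_frames)
```

Under this assumption the benchmark holds 3 x 2 x 1,000 = 6,000 simulations and 600,000 crack-field snapshots in total.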

#4 Analysis of Atomic Charge State and Atomic Number for VAMOS++ Magnetic Spectrometer using Deep Neural Networks and Fractionally Labelled Events

Authors: M. Rejmund, A. Lemasson

The VAMOS++ magnetic spectrometer is a multi-parametric system that integrates ion optical magnetic elements with a multi-detector stack. The magnetic elements, along with the tracking and timing detectors and the trajectory reconstruction method, provide the analysis of the magnetic rigidity, the trajectory length between the beam interaction point and the focal plane of the spectrometer, and the related velocity and mass-over-charge ratio. The segmented ionization chamber provides the energy measurements necessary to analyze the atomic charge state and atomic number. However, this analysis critically suffers from inherent limitations due to the variable thickness and non-uniformity of the entrance window of the ionization chamber and other detector imperfections. Conventionally, this meticulous, detailed analysis is exceptionally tedious, often requiring several months to complete. We present a novel method utilizing deep neural networks, trained on an experimental dataset with only a small fraction of precisely labeled events for the lowest and best-resolved atomic charge states or numbers. This innovative approach enables the networks to autonomously and accurately classify the remaining events. This method drastically accelerates the acquisition of high-resolution atomic charge state and atomic number spectra, reducing analysis time from months to mere hours. Crucially, by discarding human bias, this approach ensures standardized, optimal, and reproducible results with unprecedented efficiency.

Subjects: Instrumentation and Detectors; Nuclear Experiment; Atomic Physics; Data Analysis, Statistics and Probability

Publish: 2025-06-20 14:52:22 UTC
