Statistics

2026-03-24 | Total: 117

#1 Multiview Graph Fusion with Covariates

Authors: Sharmistha Guha, Jose Rodriguez-Acosta, Ivo Dinov

Joint modeling of multiview graphs with a common set of nodes between views and auxiliary predictors is an essential, yet less explored, area in statistical methodology. Traditional approaches often treat graphs in different views as independent or fail to adequately incorporate predictors, potentially missing complex dependencies within and across graph views and leading to reduced inferential accuracy. Motivated by such methodological shortcomings, we introduce an integrative Bayesian approach for joint learning of a multiview graph with vector-valued predictors. Our modeling framework assumes a common set of nodes for each graph view while allowing for diverse interconnections or edge weights between nodes across graph views, accommodating both binary and continuous valued edge weights. By adopting a hierarchical Bayesian modeling approach, our framework seamlessly integrates information from diverse graphs through carefully designed prior distributions on model parameters. This approach enables the estimation of crucial model parameters defining the relationship between these graph views and predictors, as well as offers predictive inference of the graph views. Crucially, the approach provides uncertainty quantification in all such inferences. Theoretical analysis establishes that the posterior predictive density for our model asymptotically converges to the true data-generating density, under mild assumptions on the true data-generating density and the growth of the number of graph nodes relative to the sample size. Simulation studies validate the inferential advantages of our approach over predictor-dependent tensor learning and independent learning of different graph views with predictors. We further illustrate model utility by analyzing functional connectivity graphs in neuroscience under cognitive control tasks, relating task-related brain connectivity with phenotypic measures.

Subjects: Methodology, Applications

Publish: 2026-03-23 17:12:48 UTC


#2 Identification of physiological shock in intensive care units via Bayesian regime switching models

Authors: Emmett B. Kendall, Jonathan P. Williams, Curtis B. Storlie, Misty A. Radosevich, Erica D. Wittwer, Matthew A. Warner

Detection of occult hemorrhage (i.e., internal bleeding) in patients in intensive care units (ICUs) can pose significant challenges for critical care workers. Because blood loss may not always be clinically apparent, clinicians rely on monitoring vital signs for specific trends indicative of a hemorrhage event. The inherent difficulties of diagnosing such an event can lead to late intervention by clinicians which has catastrophic consequences. Therefore, a methodology for early detection of hemorrhage has wide utility. We develop a Bayesian regime switching model (RSM) that analyzes trends in patients' vitals and labs to provide a probabilistic assessment of the underlying physiological state that a patient is in at any given time. This article is motivated by a comprehensive dataset we curated from Mayo Clinic of 33,924 real ICU patient encounters. Longitudinal response measurements are modeled as a vector autoregressive process conditional on all latent states up to the current time point, and the latent states follow a Markov process. We present a novel Bayesian sampling routine to learn the posterior probability distribution of the latent physiological states, as well as develop an approach to account for pre-ICU-admission physiological changes. A simulation and real case study illustrate the effectiveness of our approach.

Subjects: Applications, Methodology, Machine Learning, Other Statistics

Publish: 2026-03-23 17:03:07 UTC


#3 Stable Algorithms Lower Bounds for Estimation

Authors: Xifan Yu, Ilias Zadik

In this work, we show that for all statistical estimation problems, a natural MMSE instability (discontinuity) condition implies the failure of stable algorithms, serving as a version of the overlap gap property (OGP) for estimation tasks. Using this criterion, we establish separations between stable and polynomial-time algorithms for the following MMSE-unstable tasks: (i) Planted Shortest Path, where Dijkstra's algorithm succeeds, (ii) random Parity Codes, where Gaussian elimination succeeds, and (iii) Gaussian Subset Sum, where lattice-based methods succeed. For all three, we further show that all low-degree polynomials are stable, yielding separations against low-degree methods and a new method to bound the low-degree MMSE. In particular, our technique highlights that MMSE instability is a common feature for Shortest Path and the noiseless Parity Codes and Gaussian Subset Sum. Last, we highlight that our work places rigorous algorithmic footing on the long-standing physics belief that first-order phase transitions, which in this setting translate to MMSE instability, impose fundamental limits on classes of efficient algorithms.

Subjects: Statistics Theory, Computational Complexity, Data Structures and Algorithms

Publish: 2026-03-23 16:50:24 UTC


#4 Generalized Sequential Monte Carlo Sampling for Redistricting Simulation

Authors: Philip O'Sullivan, Kosuke Imai, Cory McCartan

Simulation methods have become important tools for quantifying partisan and racial bias in redistricting plans. We generalize the Sequential Monte Carlo (SMC) algorithm of McCartan and Imai (2023), one of the commonly used approaches. First, our generalized SMC (gSMC) algorithm can split off regions of arbitrary size, rather than a single district as in the original SMC framework, enabling the sampling of multi-member districts. Second, the gSMC algorithm can operate over various sampling spaces, providing additional computational flexibility. Third, we derive optimal-variance incremental weights and show how to compute them efficiently for each sampling space. Finally, we incorporate Markov chain Monte Carlo (MCMC) steps, creating a hybrid gSMC-MCMC algorithm that can be used for large-scale redistricting applications. We demonstrate the effectiveness of the proposed methodology through analyses of the Irish Parliament, which uses multi-member districts, and the Pennsylvania House of Representatives, which has more than 200 single-member districts.

Subjects: Applications, Computers and Society, Probability

Publish: 2026-03-23 16:48:43 UTC


#5 Data Curation for Machine Learning Interatomic Potentials by Determinantal Point Processes

Authors: Joanna Zou, Youssef Marzouk

The development of machine learning interatomic potentials faces a critical computational bottleneck with the generation and labeling of useful training datasets. We present a novel application of determinantal point processes (DPPs) to the task of selecting informative subsets of atomic configurations to label with reference energies and forces from costly quantum mechanical methods. Through experiments with hafnium oxide data, we show that DPPs are competitive with existing approaches to constructing compact but diverse training sets by utilizing kernels of molecular descriptors, leading to improved accuracy and robustness in machine learning representations of molecular systems. Our work identifies promising directions to employ DPPs for unsupervised training data curation with heterogeneous or multimodal data, or in online active learning schemes for iterative data augmentation during molecular dynamics simulation.

Subjects: Applications, Machine Learning

Publish: 2026-03-23 16:22:19 UTC


#6 Detecting change regions on spheres

Authors: Di Su, Yining Chen, Tengyao Wang

While change point detection in time series data has been extensively studied, little attention has been given to its generalisation to data observed on spheres or other manifolds, where changes may occur within spatially complex regions with irregular boundaries, posing significant challenges. We propose a new class of estimators, namely, Change Region Identification and SeParation (CRISP), to locate changes in the mean function of a signal-plus-noise model defined on $d$-dimensional spheres. The CRISP estimator applies to scenarios with a single change region, and is extended to multiple change regions via a newly developed generic scheme. The convergence rate of the CRISP estimator is shown to depend on the VC dimension of the hypothesis class that characterises the change regions in general. We also carefully study the case where change regions have the geometry of spherical caps. Simulations confirm the promising finite-sample performance of this approach. The CRISP estimator's practical applicability is further demonstrated through two real data sets on global temperature and ozone hole.

Subjects: Methodology, Statistics Theory

Publish: 2026-03-23 15:04:04 UTC


#7 MAGPI: Multifidelity-Augmented Gaussian Process Inputs for Surrogate Modeling from Scarce Data

Authors: Atticus Rex, Elizabeth Qian, David Peterson

Supervised machine learning describes the practice of fitting a parameterized model to labeled input-output data. Supervised machine learning methods have demonstrated promise in learning efficient surrogate models that can (partially) replace expensive high-fidelity models, making many-query analyses, such as optimization, uncertainty quantification, and inference, tractable. However, when training data must be obtained through the evaluation of an expensive model or experiment, the amount of training data that can be obtained is often limited, which can make learned surrogate models unreliable. Yet in many engineering and scientific settings, cheaper low-fidelity models may be available, for example arising from simplified physics modeling or coarse grids. These models may be used to generate additional low-fidelity training data. The goal of multifidelity machine learning is to use both high- and low-fidelity training data to learn a surrogate model which is cheaper to evaluate than the high-fidelity model, but more accurate than any available low-fidelity model. This work proposes a new multifidelity training approach for Gaussian process regression (GPR) which uses low-fidelity data to define additional features that augment the input space of the learned model. The approach unites desirable properties from two separate classes of existing multifidelity GPR approaches, cokriging and autoregressive estimators. Numerical experiments on several test problems demonstrate both increased predictive accuracy and reduced computational cost relative to the state of the art.

Subjects: Machine Learning, Machine Learning

Publish: 2026-03-23 14:49:38 UTC


#8 Cost-Aware Optimized Front-Door Experimental Design

Authors: Leopold Mareis, Mathias Drton

Causal effect estimation often follows cost-constrained sequential data collection. This work considers multivariate linear front-door models with arbitrary unobserved confounding on treatment and response. We optimize the experimental design by balancing the statistical efficiency and measurement costs through partial data. The full-data efficient influence function for the causal effect is derived, together with the geometry of all observed-data influence functions. This characterization yields a closed-form optimal sampling policy and an estimator to minimize the asymptotic variance of regular asymptotically linear (RAL) estimators within a class of augmented full-data influence functions. The resulting design also covers back-door estimation. In simulations and applications to biological, medical, and industrial datasets, the optimized designs achieve substantial efficiency gains ($5.3\%$ to $31.9\%$) over naive full-sampling strategies.

Subjects: Methodology, Statistics Theory

Publish: 2026-03-23 14:31:18 UTC


#9 Pair-based estimators of infection and removal rates for stochastic epidemic models

Authors: Seth D. Temple, Jonathan Terhorst

Stochastic epidemic models can estimate infection and removal rates, and derived quantities such as the basic reproductive number ($R_0$), when both infection and removal times are observed. In practice, however, removal times are often available while infection times are not, and existing methods that rely only on removal times can become unstable or biased. We study inference for stochastic SIR/SEIR models in a partial-observation setting. We develop imputation-based estimators that use a small calibration sample of fully observed infectious periods, derive closed-form expressions for the pairwise exposure terms they require, and use a studentized parametric bootstrap for bias correction and uncertainty quantification. In simulations, removal-time-only methods performed poorly in moderate to large $R_0$ scenarios, while observing even tens of complete infectious periods substantially improved the estimation of the infection rate. A reanalysis of the 1861 Hagelloch measles outbreak under simulated missingness recovered stable qualitative differences in transmission between school classes. Based on our results, we advocate for the targeted collection of a modest number of complete infectious periods as a means of improving surveillance in the early stages of an epidemic.

Subject: Methodology

Publish: 2026-03-23 13:59:12 UTC


#10 Unified implementation and comparison of Bayesian shrinkage methods for treatment effect estimation in subgroups

Authors: Marcel Wolbers, Miriam Pedrera Gómez, Alex Ocampo, Isaac Gravestock

Evaluating treatment effect heterogeneity across patient subgroups is a fundamental aspect of clinical trial analysis. Yet, these analyses have inherent limitations due to small sample sizes and the substantial number of subgroups investigated. Statisticians in regulatory agencies and pharmaceutical companies have begun considering shrinkage methods grounded in Bayesian statistical theory. These methods incorporate priors on treatment effect heterogeneity, which operationally shrink raw subgroup treatment effect estimates towards the overall treatment effect. Various shrinkage estimators and priors have been proposed, yet it remains unclear which methods perform best. This work provides a unified presentation, software implementation (in the R package bonsaiforest2), and simulation comparison of one-way and global shrinkage methods for continuous, binary, count, and time-to-event endpoints. One-way models fit a separate shrinkage model for each subgrouping variable, whereas global models fit a model including all subgroup indicators at once. Both can derive standardized subgroup-specific treatment effects. Across all simulation scenarios, shrinkage methods outperformed the standard subgroup estimator without shrinkage in terms of mean squared error. They were also more efficient in identifying a non-efficacious subgroup. Global shrinkage models tended to have smaller mean squared error and less dependence on hyperprior parameters than one-way models, but also exhibited slightly larger bias and worse frequentist coverage of associated credible intervals. For both models, hyperprior choices anchored in trial assumptions about the anticipated size of the overall treatment effect performed well. We conclude that some degree of shrinkage is preferable to none and advocate for the routine inclusion of shrunken estimates in clinical forest plots to facilitate more robust decision-making.

Subject: Methodology

Publish: 2026-03-23 13:32:32 UTC


#11 Parsimonious Subset Selection for Generalized Linear Models with Biomedical Applications

Authors: Anant Mathur, Benoit Liquet, Samuel Muller, Sarat Moka

High-dimensional biomedical studies require models that are simultaneously accurate, sparse, and interpretable, yet exact best subset selection for generalized linear models is computationally intractable. We develop a scalable method that combines a continuous Boolean relaxation of the subset problem with a Frank-Wolfe algorithm driven by envelope gradients. The resulting method, which we refer to as COMBSS-GLM, is simple to implement, requires one penalized generalized linear model fit per iteration, and produces sparse models along a model-size path. Theoretically, we identify a curvature-based parameter regime in which the relaxed objective is concave in the selection weights, implying that global minimizers occur at binary corners. Empirically, in logistic and multinomial simulations across low- and high-dimensional correlated settings, the proposed method consistently improves variable-selection quality relative to established penalized likelihood competitors while maintaining strong predictive performance. In biomedical applications, it recovers established loci in a binary-outcome rice genome-wide association study and achieves perfect multiclass test accuracy on the Khan SRBCT cancer dataset using a small subset of genes. Open-source implementations are available in R at https://github.com/benoit-liquet/COMBSS-GLM-R and in Python at https://github.com/saratmoka/COMBSS-GLM-Python.

Subjects: Methodology, Computation

Publish: 2026-03-23 13:09:57 UTC


#12 Structural Concentration in Weighted Networks: A Class of Topology-Aware Indices

Authors: L. Riso, M. G. Zoia

This paper develops a unified framework for measuring concentration in weighted systems embedded in networks of interactions. While traditional indices such as the Herfindahl-Hirschman Index capture dispersion in weights, they neglect the topology of relationships among the elements receiving those weights. To address this limitation, we introduce a family of topology-aware concentration indices that jointly account for weight distributions and network structure. At the core of the framework lies a baseline Network Concentration Index (NCI), defined as a normalized quadratic form that measures the fraction of potential weighted interconnection realized along observed network links. Building on this foundation, we construct a flexible class of extensions that modify either the interaction structure or the normalization benchmark, including weighted, density-adjusted, null-model, degree-constrained, transformed-data, and multi-layer variants. This family of indices preserves key properties such as normalization, invariance, and interpretability, while allowing concentration to be evaluated across different dimensions of dependence, including intensity, higher-order interactions, and extreme events. Theoretical results characterize the indices and establish their relationship with classical concentration and network measures. Empirical and simulation evidence demonstrate that systems with identical weight distributions may exhibit markedly different levels of structural concentration depending on network topology, highlighting the additional information captured by the proposed framework. The approach is broadly applicable to economic, financial, and complex systems in which weighted elements interact through networks.

Subjects: Machine Learning, Machine Learning

Publish: 2026-03-23 12:41:12 UTC
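
Illustration for entry #12: the abstract contrasts the Herfindahl-Hirschman Index, which depends only on the weights, with a normalized quadratic form that also depends on the network. The sketch below is a hypothetical toy version of that idea and is not the paper's NCI definition: the function `quadratic_concentration` and its normalization (observed-link interconnection divided by the complete-graph interconnection) are assumptions made for illustration only.

```python
def hhi(weights):
    """Herfindahl-Hirschman Index: sum of squared shares."""
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)


def quadratic_concentration(weights, adjacency):
    """Hypothetical topology-aware index (an assumption, NOT the paper's NCI).

    Ratio of the weighted interconnection on observed links, s' A s,
    to the same quantity on the complete graph, s' (J - I) s,
    where s are the shares and A is a 0/1 adjacency matrix.
    """
    n = len(weights)
    total = sum(weights)
    s = [w / total for w in weights]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    realized = sum(s[i] * s[j] * adjacency[i][j] for i, j in pairs)
    potential = sum(s[i] * s[j] for i, j in pairs)
    return realized / potential


weights = [0.5, 0.3, 0.2]
link_between_two_largest = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
link_between_two_smallest = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]

# Same weights, hence the same HHI, but different toy index values.
print(hhi(weights))
print(quadratic_concentration(weights, link_between_two_largest))
print(quadratic_concentration(weights, link_between_two_smallest))
```

Under these assumptions, the two networks share one HHI value while the toy index differs, which is the qualitative point the abstract makes about identical weight distributions on different topologies.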


#13 The Cascade Identity: 2SLS as a Policy Parameter in Capacity-Constrained Settings

Authors: Niklas Bengtsson, Per Engström

A growing literature shows that two-stage least squares (2SLS) with multiple treatments yields coefficients that are difficult to interpret under heterogeneous treatment effects and cross-effects in the first stage. We show that in capacity-constrained allocation systems, these cross-effects are not a nuisance but the source of a clean policy interpretation. When treatments are rationed and the instrument operates on the same margin as the policy of interest, the 2SLS coefficient $\beta_k$ equals the total societal effect of expanding treatment $k$ by one slot, including all cascading reallocations through the system. The mechanism is general: it applies whenever fixed supply constrains allocation, whether through ranked queues, waitlists, or market-clearing prices. This cascade identity $\mathbf{T} = \boldsymbol{\beta}$ holds for any first-stage matrix, under arbitrary treatment effect heterogeneity, and requires only instrument relevance and that the instrument operates on the same margin as the policy. The result applies to university admissions, school choice, medical residency matching, public housing, and other rationed allocation settings. We provide an empirical application using lottery-based admission to Swedish university programs and charitable giving as the outcome.

Subjects: Methodology, Econometrics

Publish: 2026-03-23 12:41:01 UTC


#14 On the identifiability of Dirichlet mixture models

Authors: Hien Duy Nguyen, Mayetri Gupta

We study identifiability of finite mixtures of Dirichlet distributions on the interior of the simplex. We first prove a shift identity showing that every Dirichlet density can be written as a mixture of $J$ shifted Dirichlet densities, where $J-1$ is the dimension of the simplex support, which yields non-identifiability on the full parameter space. We then show that identifiability is recovered on a fixed-total parameter slice and on restricted box-type regions. On the full parameter space, we prove that any nontrivial linear relation among Dirichlet kernels must involve at least $J$ coefficients sharing a common sign, and deduce that mixtures with fewer than $J$ atoms are identifiable. We further report direct non-identifiability implications for unrestricted finite mixtures of generalized Dirichlet, Dirichlet-multinomial, fixed-topic-matrix latent Dirichlet allocation, Beta-Liouville, and inverted Beta-Liouville models.

Subject: Statistics Theory

Publish: 2026-03-23 12:35:32 UTC
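
Illustration for entry #14: one standard identity of the kind the abstract describes follows from $\sum_j x_j = 1$ on the simplex. Since $\mathrm{Dir}(x; \alpha + e_j) = (\alpha_0 / \alpha_j)\, x_j\, \mathrm{Dir}(x; \alpha)$ with $\alpha_0 = \sum_j \alpha_j$, we get $\mathrm{Dir}(x; \alpha) = \sum_{j=1}^{J} (\alpha_j / \alpha_0)\, \mathrm{Dir}(x; \alpha + e_j)$. Whether this is exactly the paper's shift identity is an assumption; the sketch below only checks this particular identity numerically, with hypothetical helper names.

```python
import math


def dirichlet_pdf(x, alpha):
    """Dirichlet density at an interior point x of the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * math.log(xi) for xi, a in zip(x, alpha))
    return math.exp(log_norm + log_kernel)


def shifted_mixture_pdf(x, alpha):
    """Mixture of J shifted Dirichlet densities with weights alpha_j / alpha_0."""
    alpha_0 = sum(alpha)
    total = 0.0
    for j, alpha_j in enumerate(alpha):
        shifted = list(alpha)
        shifted[j] += 1.0  # shift the j-th parameter by one: alpha + e_j
        total += (alpha_j / alpha_0) * dirichlet_pdf(x, shifted)
    return total


x = (0.2, 0.3, 0.5)          # arbitrary interior point, J = 3
alpha = (1.5, 2.0, 0.7)      # arbitrary positive parameters
print(dirichlet_pdf(x, alpha), shifted_mixture_pdf(x, alpha))
```

The two printed values agree up to floating-point error, so a single Dirichlet component and a $J$-component mixture define the same density, which is how such an identity produces non-identifiability on the unrestricted parameter space.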


#15 Identifiability and amortized inference limitations in Kuramoto models

Authors: Emma Hannula, Jana de Wiljes, Matthew T. Moores, Heikki Haario, Lassi Roininen

Bayesian inference is a powerful tool for parameter estimation and uncertainty quantification in dynamical systems. However, for nonlinear oscillator networks such as Kuramoto models, widely used to study synchronization phenomena in physics, biology, and engineering, inference is often computationally prohibitive due to high-dimensional state spaces and intractable likelihood functions. We present an amortized Bayesian inference approach that learns a neural approximation of the posterior from simulated phase dynamics, enabling fast, scalable inference without repeated sampling or optimization. Applied to synthetic Kuramoto networks, the method shows promising results in approximating posterior distributions and capturing uncertainty, with computational savings compared to traditional Bayesian techniques. These findings suggest that amortized inference is a practical and flexible framework for uncertainty-aware analysis of oscillator networks.

Subjects: Applications, Machine Learning

Publish: 2026-03-23 09:46:10 UTC
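
Illustration for entry #15: the simulated phase dynamics mentioned in the abstract can be generated from the classical all-to-all Kuramoto model, $\dot\theta_i = \omega_i + (K/N) \sum_j \sin(\theta_j - \theta_i)$. The sketch below is a minimal forward-Euler simulator under that assumption; the paper's network structure, noise model, and inference pipeline are not reproduced, and all function names are hypothetical.

```python
import cmath
import math


def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1]; r = 1 means full synchrony."""
    return abs(sum(cmath.exp(1j * th) for th in phases)) / len(phases)


def kuramoto_step(phases, omegas, coupling, dt):
    """One forward-Euler step of the all-to-all Kuramoto model."""
    n = len(phases)
    new_phases = []
    for i in range(n):
        interaction = sum(math.sin(phases[j] - phases[i]) for j in range(n))
        new_phases.append(phases[i] + dt * (omegas[i] + coupling / n * interaction))
    return new_phases


def simulate(phases, omegas, coupling, dt, steps):
    """Iterate the Euler step and return the final phases."""
    for _ in range(steps):
        phases = kuramoto_step(phases, omegas, coupling, dt)
    return phases


initial = [0.0, 0.5, 1.0, 1.5, 2.0]
final = simulate(initial, omegas=[0.0] * 5, coupling=2.0, dt=0.05, steps=400)
print(order_parameter(initial), order_parameter(final))
```

With identical natural frequencies and positive coupling, the order parameter rises toward 1. In a simulation-based inference setting, trajectories like these, generated for many parameter draws, would form the training data.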


#16 Fixed Rank co-Kriging: a model for multivariate spatial prediction

Authors: Gaia Caringi, Piercesare Secchi

This work develops a multivariate extension of the Fixed Rank Kriging (FRK) framework for spatial prediction in settings where multiple spatial processes may provide complementary information. The goal is to preserve the computational efficiency, the ability to operate without assuming stationarity over the domain, and the spatial support flexibility of FRK, while incorporating cross-process dependence. To this end, we employ a multiresolution coregionalization structure for the latent spatial effects, in which spatial basis functions are combined with Gaussian Markov Random Field coefficients. An estimation procedure based on the expectation-maximization algorithm is developed, designed to exploit the multiresolution latent structure. Through simulation studies, we examine when the proposed joint modeling is beneficial. We consider cases in which one process is observed more sparsely or is entirely unobserved in a subregion and find that the multivariate formulation is able to borrow information from the more densely observed process, producing coherent and accurate predictions even where direct observations are limited or absent. Finally, the model is applied to the analysis of PM10 concentrations in Northern Italy, illustrating its applicability in a real environmental context.

Subject: Methodology

Publish: 2026-03-23 09:40:07 UTC


#17 CoNBONet: Conformalized Neuroscience-inspired Bayesian Operator Network for Reliability Analysis

Authors: Shailesh Garg, Souvik Chakraborty

Time-dependent reliability analysis of nonlinear dynamical systems under stochastic excitations is a critical yet computationally demanding task. Conventional approaches, such as Monte Carlo simulation, necessitate repeated evaluations of computationally expensive numerical solvers, leading to significant computational bottlenecks. To address this challenge, we propose CoNBONet, a neuroscience-inspired surrogate model that enables fast, energy-efficient, and uncertainty-aware reliability analysis, providing a scalable alternative to techniques such as Monte Carlo simulations. CoNBONet, short for Conformalized Neuroscience-inspired Bayesian Operator Network, leverages the expressive power of deep operator networks while integrating neuroscience-inspired neuron models to achieve fast, low-power inference. Unlike traditional surrogates such as Gaussian processes, polynomial chaos expansions, or support vector regression, which may face scalability challenges for high-dimensional, time-dependent reliability problems, CoNBONet offers fast and energy-efficient inference enabled by a neuroscience-inspired network architecture, calibrated uncertainty quantification with theoretical guarantees via split conformal prediction, and strong generalization capability through an operator-learning paradigm that maps input functions to system response trajectories. Validation of the proposed CoNBONet for various nonlinear dynamical systems demonstrates that CoNBONet preserves predictive fidelity and achieves reliable coverage of failure probabilities, making it a powerful tool for robust and scalable reliability analysis in engineering design.

Subjects: Machine Learning, Machine Learning

Publish: 2026-03-23 08:09:34 UTC
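
Illustration for entry #17: split conformal prediction, which the abstract names as its calibration step, can be sketched generically. The version below assumes absolute-residual scores on a held-out calibration set and a scalar prediction. How the paper applies it to operator-network outputs is not specified here, and the function names are hypothetical.

```python
import math


def conformal_radius(calibration_scores, alpha):
    """Interval half-width from split conformal prediction.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest calibration score.
    Returns infinity when the calibration set is too small for the level.
    """
    n = len(calibration_scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf
    return sorted(calibration_scores)[k - 1]


def prediction_interval(point_prediction, calibration_scores, alpha):
    """Symmetric interval around a point prediction."""
    radius = conformal_radius(calibration_scores, alpha)
    return point_prediction - radius, point_prediction + radius


# Absolute residuals |y - f(x)| of some surrogate f on 10 calibration points.
scores = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8, 1.0]
print(prediction_interval(3.0, scores, alpha=0.2))
```

Under exchangeability of calibration and test points, an interval built this way covers the true response with probability at least $1 - \alpha$, regardless of how accurate the underlying surrogate is.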


#18 Neyman-Pearson multiclass classification under label noise via empirical likelihood

Authors: Qiong Zhang, Qinglong Tian, Pengfei Li

In many classification problems, the costs of misclassifying observations from different classes can be highly unequal. The Neyman-Pearson multiclass classification (NPMC) framework addresses this issue by minimizing a weighted misclassification risk while imposing upper bounds on class-specific error probabilities. Existing NPMC methods typically assume that training labels are correctly observed. In practice, however, labels are often corrupted due to measurement error or annotation, and the effect of such label noise on NPMC procedures remains largely unexplored. We study the NPMC problem when only noisy labels are available in the training data. We propose an empirical likelihood (EL)-based method that relates the distributions of noisy and true labels through an exponential tilting density ratio model. The resulting maximum EL estimators recover the class proportions and posterior probabilities of the clean labels required for error control. We establish consistency, asymptotic normality, and optimal convergence rates for these estimators. Under mild conditions, the resulting classifier satisfies NP oracle inequalities with respect to the true labels asymptotically. An expectation-maximization algorithm computes the maximum EL estimators. Simulations show that the proposed method performs comparably to the oracle classifier under clean labels and substantially improves over procedures that ignore label noise.

Subjects: Methodology, Machine Learning

Publish: 2026-03-23 06:39:09 UTC


#19 Feature Incremental Clustering with Generalization Bounds

Authors: Jing Zhang, Chenping Hou

In many learning systems, such as activity recognition systems, as new data collection methods continue to emerge in various dynamic environmental applications, the attributes of instances accumulate incrementally, with data being stored in gradually expanding feature spaces. How to design theoretically guaranteed algorithms to effectively cluster this special type of data stream, commonly referred to as activity recognition, remains unexplored. Compared to traditional scenarios, we will face at least two fundamental questions in this feature incremental scenario. (i) How to design preliminary and effective algorithms to address the feature incremental clustering problem? (ii) How to analyze the generalization bounds for the proposed algorithms and under what conditions do these algorithms provide a strong generalization guarantee? To address these problems, by tailoring the most common clustering algorithm, i.e., $k$-means, as an example, we propose four types of Feature Incremental Clustering (FIC) algorithms corresponding to different situations of data access: Feature Tailoring (FT), Data Reconstruction (DR), Data Adaptation (DA), and Model Reuse (MR), abbreviated as FIC-FT, FIC-DR, FIC-DA, and FIC-MR. Subsequently, we offer a detailed analysis of the generalization error bounds for these four algorithms and highlight the critical factors influencing these bounds, such as the amounts of training data, the complexity of the hypothesis space, the quality of pre-trained models, and the discrepancy of the reconstruction feature distribution. The numerical experiments show the effectiveness of the proposed algorithms, particularly in their application to activity recognition clustering tasks.

Subjects: Statistics Theory, Machine Learning

Publish: 2026-03-23 05:35:31 UTC


#20 Bayesian inference for ordinary differential equations models with heteroscedastic measurement error

Authors: Selva Salimi, David J. Warne, Christopher Drovandi

Ordinary differential equation (ODE) models are widely used to describe systems in many areas of science. To ensure these models provide accurate and interpretable representations of real-world dynamics, it is often necessary to infer parameters from data, which involves specifying the form of the ODE system as well as a statistical model describing the observational process. A popular and convenient choice for the error model is a Gaussian distribution with constant variance. However, the choice may not be realistic in many systems, since the variance of the observational error may vary over time or have some dependence on the system state (heteroscedastic), reflecting changes in measurement conditions, environmental fluctuations, or intrinsic system variability. Misspecification of the error model can lead to substantial inaccuracies in the posterior estimates of the ODE model parameters and predictions. More elaborate parametric error models could be specified, but this would increase computational cost because additional parameters would need to be estimated within the MCMC procedure, and such models may still be misspecified. In this work we propose a two-step semi-parametric framework for Bayesian estimation of ODE model parameters when there exists heteroscedasticity in the error process. The first step applies a heteroscedastic Gaussian process to estimate the time-dependent error, and the second step performs Bayesian inference for the ODE model parameters using the time-dependent error estimated in step one in the likelihood function. Through a simulation study and two real-world applications, we demonstrate that the proposed approach yields more reliable posterior inference and predictive uncertainty compared to the standard homoscedastic models. Although our focus is on heteroscedasticity, the framework could be applied to handle more complex error processes.

Subjects: Methodology, Computation

Publish: 2026-03-23 04:04:32 UTC
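
As a rough illustration of the second step described in entry #20, the sketch below evaluates a Gaussian likelihood in which each observation has its own standard deviation, and normalizes it over a parameter grid. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy ODE dy/dt = -theta * y with its closed-form solution, the flat prior on a grid (in place of MCMC), and the names `ode_solution`, `log_likelihood`, and `grid_posterior`. The per-time standard deviations `sigmas` are treated as given; estimating them with a heteroscedastic Gaussian process (step one) is not shown.

```python
import math


def ode_solution(theta, t, y0=1.0):
    """Closed-form solution of the toy ODE dy/dt = -theta * y (illustrative)."""
    return y0 * math.exp(-theta * t)


def log_likelihood(theta, times, obs, sigmas):
    """Gaussian log-likelihood with a separate standard deviation per time point.

    `sigmas` stands in for the time-dependent error estimated in a first step.
    """
    total = 0.0
    for t, y, s in zip(times, obs, sigmas):
        resid = y - ode_solution(theta, t)
        total += -0.5 * math.log(2.0 * math.pi * s * s) - 0.5 * (resid / s) ** 2
    return total


def grid_posterior(times, obs, sigmas, grid):
    """Posterior weights over a parameter grid under a flat prior (assumption)."""
    logp = [log_likelihood(th, times, obs, sigmas) for th in grid]
    m = max(logp)  # subtract the maximum for numerical stability
    w = [math.exp(lp - m) for lp in logp]
    z = sum(w)
    return [wi / z for wi in w]
```

Observations with a larger plug-in standard deviation contribute less to the posterior, which is the practical effect of replacing a constant-variance error model with a time-dependent one.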


#21 Tiny but uniform improvements of adaptive BH procedures via compound e-values

Authors: Nikolaos Ignatiadis, Ruodu Wang, Aaditya Ramdas

Since the seminal Benjamini-Hochberg (BH) procedure for controlling the false discovery rate (FDR) was proposed, dozens of papers have attempted to improve its power by adapting to the unknown proportion of nulls. We observe that most null proportion estimates are simply compound e-values in disguise, and thus most adaptive FDR procedures can be interpreted as instances of the e-weighted BH (ep-BH) procedure of Ignatiadis, Wang, and Ramdas [2024], i.e., the BH procedure weighted by compound e-values. This lens helps us show that most existing procedures are inadmissible, and we provide uniform improvements to them. While the improvements are small in practice, they come for free (without additional assumptions) and help unify the literature. We also use our "leave-one-out ep-BH method" to design a new method with finite-sample FDR control for the simultaneous t-test setting.

Subjects: Methodology, Statistics Theory

Publish: 2026-03-22 22:25:14 UTC
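
For readers unfamiliar with the procedure that entry #21 builds on, here is a minimal sketch of the standard BH step-up rule with optional weights. The weighted variant divides each p-value by a nonnegative weight before applying BH; treating those weights as compound e-values is the reading suggested by the abstract, but how such e-values are constructed is not shown here, and the function name `benjamini_hochberg` and its signature are illustrative assumptions.

```python
def benjamini_hochberg(pvals, alpha=0.05, weights=None):
    """Return sorted indices of rejected hypotheses under the BH step-up rule.

    If `weights` is given, BH is applied to p_i / w_i (illustrative weighted
    variant); a weight of zero means the hypothesis can never be rejected.
    """
    m = len(pvals)
    if weights is None:
        weights = [1.0] * m
    adjusted = [p / w if w > 0 else float("inf") for p, w in zip(pvals, weights)]
    order = sorted(range(m), key=lambda i: adjusted[i])

    # Find the largest rank k with adjusted p-value <= alpha * k / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if adjusted[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])
```

With unit weights this reduces to the original BH procedure; larger weights make the corresponding hypotheses easier to reject.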


#22 Adaptive and robust experimental design for linear dynamical models using Kalman filter

Authors: Arno Strouwen, Bart M. Nicolaï, Peter Goos

Current experimental design techniques for dynamical systems often only incorporate measurement noise, while dynamical systems also involve process noise. To construct experimental designs we need to quantify their information content. The Fisher information matrix is a popular tool to do so. Calculating the Fisher information matrix for linear dynamical systems with both process and measurement noise involves estimating the uncertain dynamical states using a Kalman filter. The Fisher information matrix, however, depends on the true but unknown model parameters. In this paper we combine two methods to solve this issue and develop a robust experimental design methodology. First, Bayesian experimental design averages the Fisher information matrix over a prior distribution of possible model parameter values. Second, adaptive experimental design allows for this information to be updated as measurements are being gathered. This updated information is then used to adapt the remainder of the design.

Subjects: Methodology, Systems and Control

Publish: 2026-03-22 19:04:47 UTC
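
The state-estimation step mentioned in entry #22 can be illustrated with a scalar Kalman filter. This is a textbook sketch under assumed notation, not the authors' design code: the model x_k = a * x_{k-1} + w_k with process noise variance `q`, the observation y_k = x_k + v_k with measurement noise variance `r`, and the prior mean `m0` and variance `p0` on the state before the first step are all illustrative choices. The function also accumulates the log-likelihood of the observations; the Fisher information itself is not computed here.

```python
import math


def kalman_filter(ys, a, q, r, m0=0.0, p0=1.0):
    """Scalar Kalman filter returning filtered means, variances, log-likelihood."""
    m, p = m0, p0
    loglik = 0.0
    means, variances = [], []
    for y in ys:
        # Predict: propagate the state estimate through the dynamics.
        m_pred = a * m
        p_pred = a * a * p + q
        # Update: correct the prediction using the new measurement.
        s = p_pred + r          # innovation variance
        gain = p_pred / s       # Kalman gain
        innov = y - m_pred      # innovation (one-step prediction error)
        m = m_pred + gain * innov
        p = (1.0 - gain) * p_pred
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + innov * innov / s)
        means.append(m)
        variances.append(p)
    return means, variances, loglik
```

Because the log-likelihood depends on the model parameters (`a`, `q`, `r` in this toy setup), its curvature depends on them too, which is why the information content of a design has to be averaged over plausible parameter values or updated as data arrive.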


#23 A Note on the Output of a Coordinate-Exchange Algorithm for Optimal Experimental Design

Authors: Arno Strouwen, Peter Goos

The coordinate-exchange algorithm is commonly used to construct optimal experimental designs. Every execution of the coordinate-exchange algorithm produces a new, seemingly random, order of the selected design points. In this short communication, we study the order of the design points produced by the algorithm and conclude that certain orders appear much more often than others. As a result, an explicit randomization step of the design points is required before conducting an experiment using a design produced by a coordinate-exchange algorithm.

Subjects: Methodology, Computation

Publish: 2026-03-22 18:45:55 UTC
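
The explicit randomization step recommended in entry #23 amounts to shuffling the run order of the design before the experiment is carried out. A minimal sketch, assuming the design is available as a list of runs; the function name `randomize_run_order` and the optional `seed` argument are illustrative:

```python
import random


def randomize_run_order(design, seed=None):
    """Return a uniformly shuffled copy of the design's runs.

    `design` is any sequence of design points (for example, tuples of factor
    settings). The input is left unchanged.
    """
    rng = random.Random(seed)
    runs = list(design)
    rng.shuffle(runs)
    return runs
```

For example, `randomize_run_order([(-1, -1), (-1, 1), (1, -1), (1, 1)], seed=2026)` returns the same four runs in a shuffled order that is reproducible for that seed.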


#24 Generalized Discrete Diffusion from Snapshots

Authors: Oussama Zekri, Théo Uscidda, Nicolas Boullé, Anna Korba

We introduce Generalized Discrete Diffusion from Snapshots (GDDS), a unified framework for discrete diffusion modeling that supports arbitrary noising processes over large discrete state spaces. Our formulation encompasses all existing discrete diffusion approaches, while allowing significantly greater flexibility in the choice of corruption dynamics. The forward noising process relies on uniformization and enables fast arbitrary corruption. For the reverse process, we derive a simple evidence lower bound (ELBO) based on snapshot latents, instead of the entire noising path, that allows efficient training of standard generative modeling architectures with clear probabilistic interpretation. Our experiments on large-vocabulary discrete generation tasks suggest that the proposed framework outperforms existing discrete diffusion methods in terms of training efficiency and generation quality, and beats autoregressive models for the first time at this scale. We provide the code along with a blog post on the project page: https://oussamazekri.fr/gdds.

Subjects: Machine Learning, Artificial Intelligence, Computation and Language, Machine Learning

Publish: 2026-03-22 17:58:01 UTC
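
Uniformization, which entry #24 names as the basis of its forward noising process, is a standard way to sample a continuous-time Markov chain at a fixed time without simulating individual holding times. The sketch below is the textbook construction on a small state space and is not the GDDS implementation: given a rate matrix `Q` with rows summing to zero, it picks a rate `lam` at least as large as every exit rate, draws a Poisson number of jumps with mean `lam * t`, and applies the discrete kernel P = I + Q / lam that many times. All function names are illustrative.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small to moderate means."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def uniformized_kernel(Q):
    """Return (lam, P) with P = I + Q / lam, where lam is the largest exit rate."""
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    return lam, P


def corrupt(x0, Q, t, rng):
    """Sample the chain's state at time t, starting from state x0."""
    if all(Q[i][i] == 0 for i in range(len(Q))):
        return x0  # no transitions are possible
    lam, P = uniformized_kernel(Q)
    x = x0
    for _ in range(sample_poisson(lam * t, rng)):
        x = rng.choices(range(len(P)), weights=P[x])[0]
    return x
```

The cost of one sample is governed by the Poisson jump count rather than by the path of the process, which is what makes direct "snapshot" sampling at an arbitrary time convenient.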


#25 Closed-form conditional diffusion models for data assimilation

Authors: Brianna Binder, Assad Oberai

We propose closed-form conditional diffusion models for data assimilation. Diffusion models use data to learn the score function (defined as the gradient of the log-probability density of a data distribution), allowing them to generate new samples from the data distribution by reversing a noise injection process. While it is common to train neural networks to approximate the score function, we leverage the analytical tractability of the score function to assimilate the states of a system with measurements. To enable the efficient evaluation of the score function, we use kernel density estimation to model the joint distribution of the states and their corresponding measurements. The proposed approach also inherits the ability of conditional diffusion models to operate in black-box settings, i.e., it can accommodate systems and measurement processes without explicit knowledge of them. The ability to accommodate black-box systems, combined with the superior capabilities of diffusion models in approximating complex, non-Gaussian probability distributions, means that the proposed approach offers advantages over many widely used filtering methods. We evaluate the proposed method on nonlinear data assimilation problems based on the Lorenz-63 and Lorenz-96 systems of moderate dimensionality and nonlinear measurement models. Results show the proposed approach outperforms the widely used ensemble Kalman and particle filters when small to moderate ensemble sizes are used.

Subjects: Machine Learning, Machine Learning, Computational Physics

Publish: 2026-03-22 15:25:23 UTC
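
To make the idea of an analytically tractable score in entry #25 concrete, the sketch below evaluates the score of a Gaussian-kernel density estimate in closed form for a scalar state and a scalar measurement. It is written under several assumptions that are not taken from the paper: the noised conditional density is modeled as a weighted mixture of Gaussians centered at the ensemble states `xs` with standard deviation `sigma` (the noise level, with any state bandwidth folded into it), the weights come from a Gaussian kernel of bandwidth `h_y` on the measurement, and the name `conditional_score` is illustrative.

```python
import math


def conditional_score(x, y, xs, ys, sigma, h_y):
    """Closed-form d/dx log p_sigma(x | y) for a Gaussian-kernel mixture.

    p_sigma(x | y) is proportional to
        sum_i exp(-(y - y_i)^2 / (2 h_y^2)) * exp(-(x - x_i)^2 / (2 sigma^2)),
    so the score is a softmax-weighted average of (x_i - x) / sigma^2.
    """
    logw = [
        -((y - yi) ** 2) / (2.0 * h_y ** 2) - ((x - xi) ** 2) / (2.0 * sigma ** 2)
        for xi, yi in zip(xs, ys)
    ]
    m = max(logw)  # subtract the maximum for numerical stability
    w = [math.exp(l - m) for l in logw]
    z = sum(w)
    return sum(wi * (xi - x) for wi, xi in zip(w, xs)) / (z * sigma ** 2)
```

No network is trained: the score is computed directly from the ensemble of state and measurement pairs, so it can be evaluated for any system that can be simulated, even when its equations are not available in explicit form.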