Applications

2026-03-13 | Total: 12

#1 On the Unit Teissier Distribution: Properties, Estimation Procedures and Applications

Authors: Zuber Akhter, Mohamed A. Abdelaziz, M. Z. Anis, Ahmed Z. Afify

The Teissier distribution, originally proposed by Teissier [31], was designed to model mortality due to aging in domestic animals. More recently, Krishna et al. [19] introduced the Unit Teissier (UT) distribution on the interval (0, 1) through the transformation $X=e^{-Y}$, where $Y$ follows the Teissier distribution. In their work, the authors derived several fundamental properties of the UT distribution and investigated parameter estimation using maximum likelihood, least squares, weighted least squares and Bayesian methods. Building upon this work, the present paper develops additional theoretical and inferential results for the UT distribution. In particular, closed-form expressions for single moments of order statistics and L-moments are obtained, and characterization results based on truncated moments are established. Furthermore, several alternative parameter estimation methods are considered, including maximum product of spacings, Cramér-von Mises, Anderson-Darling, right-tail Anderson-Darling, percentile and L-moment estimation, while the estimation methods previously studied by Krishna et al. [19] are also included for comparison. Extensive simulation studies under various parameter settings and sample sizes are conducted to assess and compare the performance of the estimators. Finally, the flexibility and practical utility of the UT distribution are demonstrated using a real dataset.
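As a reminder of how the transformation $X=e^{-Y}$ acts, the following is the generic change-of-variables identity for a continuous, positive $Y$. It is not a formula quoted from the paper, and the symbols $F_Y$, $f_Y$ (CDF and density of $Y$) and $F_X$, $f_X$ are introduced here for illustration. For $x \in (0,1)$:

```latex
\begin{align*}
F_X(x) &= \Pr\left(e^{-Y} \le x\right) = \Pr\left(Y \ge -\ln x\right) = 1 - F_Y(-\ln x), \\
f_X(x) &= \frac{1}{x}\, f_Y(-\ln x).
\end{align*}
```

Substituting the Teissier CDF and density for $F_Y$ and $f_Y$ would give the corresponding UT expressions.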

Subjects: Applications, Statistics Theory, Computation, Methodology

Publish: 2026-03-12 08:49:55 UTC


#2 One-Shot Individual Claims Reserving

Authors: Ronald Richman, Mario V. Wüthrich

Individual claims reserving has not yet become established in actuarial practice. We attribute this to the absence of a satisfactory methodology: existing approaches tend to be either overly complex or insufficiently flexible and robust for practical use. Building on the classical chain-ladder (CL) method, we introduced a new perspective on individual claims reserving in Richman and Wüthrich [arXiv:2602.15385]. This manuscript has sparked considerable discussion within the actuarial community. The aim of the present paper is to continue and deepen that discussion, with the ultimate goal of advancing toward a new standard for micro-level reserving.

Subjects: Applications, Risk Management

Publish: 2026-03-12 08:28:48 UTC


#3 Finite-Sample Decision Instability in Threshold-Based Process Capability Approval

Authors: Fei Jiang, Lei Yang

Process capability indices such as $C_{pk}$ are widely used in manufacturing quality control to support supplier qualification and product release decisions based on fixed acceptance thresholds (e.g., $C_{pk} \geq 1.33$). In practice, these decisions rely on sample-based estimates computed from moderate sample sizes ($n \approx 20$ to $50$), yet the stochastic nature of the estimator is often overlooked when interpreting threshold compliance. This study establishes a local asymptotic characterization of decision behavior when the true process capability lies near a fixed threshold. Under standard regularity conditions, if the true capability equals the threshold, the acceptance probability converges to 0.5 as sample size increases, implying that a fixed $C_{pk}$ gate embeds an inherent boundary decision risk even under ideal distributional assumptions. When the true capability deviates from the threshold by $O(n^{-1/2})$, the decision probability converges to a non-degenerate limit governed by a scaled signal-to-noise ratio. Monte Carlo simulations and an empirical study on 880 manufacturing dimensions demonstrate substantial resampling-based decision instability near the commonly used 1.33 criterion. These findings provide a probabilistic interpretation of threshold-based capability decisions and quantitative guidance for assessing boundary-induced release risk in engineering practice.
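The boundary behaviour described in this abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the normal process, the one-sided binding specification limit, the usual plug-in estimator $\hat{C}_{pk} = \min(\mathrm{USL} - \bar{x}, \bar{x} - \mathrm{LSL}) / (3s)$, and the function name `acceptance_rate` are all assumptions made here for illustration.

```python
import math
import random

def acceptance_rate(n, true_cpk=1.33, threshold=1.33, reps=2000, seed=0):
    """Fraction of simulated samples whose estimated Cpk meets the threshold."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    usl = mu + 3.0 * true_cpk * sigma  # upper limit is the binding one
    lsl = mu - 10.0 * sigma            # lower limit placed far away (assumption)
    accepted = 0
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = math.fsum(x) / n
        s = math.sqrt(math.fsum((v - mean) ** 2 for v in x) / (n - 1))
        cpk_hat = min(usl - mean, mean - lsl) / (3.0 * s)
        accepted += cpk_hat >= threshold
    return accepted / reps
```

Calling `acceptance_rate(500)` with the true capability set equal to the threshold should return a value close to one half, while a comfortably capable process such as `acceptance_rate(30, true_cpk=2.0)` should be accepted almost always.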

Subjects: Applications, Statistics Theory

Publish: 2026-03-11 21:22:47 UTC


#4 A Statistically Reliable Optimization Framework for Bandit Experiments in Scientific Discovery

Authors: Tong Li, Travis Mandel, Goldie Phillips, Anna Rafferty, Eric M. Schwartz, Dehan Kong, Joseph J. Williams

Scientific experimentation is largely driven by statistical hypothesis testing to determine significant differences in interventions. Traditionally, experimenters allocate samples uniformly between each intervention. However, such an approach may lead to suboptimal outcomes; multi-armed bandits (MABs) address this problem by allocating samples adaptively to maximize outcomes. Yet, two challenges have hindered the use of MABs in scientific domains. First, common hypothesis tests (e.g., $t$-tests) become invalid under adaptive sampling without correction, leading to inflated type I and type II errors. This is an understudied problem, and prior solutions suffer from issues such as low statistical power which prevent adoption in many practical settings. Second, practitioners must explicitly balance cumulative reward with statistical efficiency, yet no general methodology exists to quantify this trade-off across algorithms. In this paper, we study assumption modification and critical region correction approaches for hypothesis testing that enable common tests to be applied to adaptively collected data. We provide heuristic justification for its power efficiency and show in simulation that it achieves higher power than existing approaches. Further, we derive a theoretically and practically motivated objective function for adaptive experiment evaluation, which we integrate into a unified experimental framework. Our framework asks experimenters to specify an experiment extension cost for their problem, and based on that enables our proposed optimization procedure to select the bandit algorithm that best balances reward and power in their setting. We show that our approach enables practitioners to improve outcomes with only slightly more steps than uniform randomization, while retaining statistical validity.

Subject: Applications

Publish: 2026-03-11 19:54:03 UTC


#5 Deriving the term-structure of loan write-off risk under IFRS 9 by using survival analysis: A benchmark study

Authors: Arno Botha, Mohammed Gabru, Marcel Muller, Janette Larney

The estimation of marginal loan write-off probabilities is a non-trivial task when modelling the loss given default (LGD) risk parameter in credit risk. We explore two types of survival models in estimating the overall write-off probability over default spell time, where these probabilities form the term-structure of write-off risk in aggregate. These survival models include a discrete-time hazard (DtH) model and a conditional inference survival tree. Both models are compared to a cross-sectional logistic regression model for write-off risk. All of these (first-stage) models are then ensconced in a broader two-stage LGD-modelling approach, wherein a loss severity model is estimated in the second stage. In expanding the model suite, a novel dichotomisation step is introduced for collapsing the write-off probability into a 0/1-value, prior to LGD-calculation. A benchmark study is subsequently conducted amongst the resulting LGD-models. We find that the DtH-model outperforms other two-stage LGD-models admirably across most diagnostics. However, a single-stage LGD-model still had the best results, likely due to the peculiar 'L-shaped' LGD-distribution in our data. Ultimately, we believe that our tutorial-style work can enhance LGD-modelling practices when estimating the expected credit loss under IFRS 9.
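To make the link between a discrete-time hazard and a term-structure of marginal probabilities concrete, here is a minimal sketch of the textbook survival-analysis identity $f(t) = h(t)\prod_{s<t}(1-h(s))$. It assumes a single event type and ignores competing resolutions of a default spell, so it is a simplification rather than the authors' model; the function name `marginal_event_probabilities` is hypothetical.

```python
def marginal_event_probabilities(hazards):
    """Convert per-period hazards h(t) into marginal probabilities
    f(t) = h(t) * prod_{s<t} (1 - h(s)) of the event occurring in period t."""
    survival = 1.0  # probability of no event before the current period
    marginal = []
    for h in hazards:
        marginal.append(survival * h)
        survival *= 1.0 - h
    return marginal
```

For example, hazards of 0.1 and then 0.2 give marginal probabilities of 0.1 and roughly 0.18, since only 90% of spells remain at risk in the second period.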

Subjects: Risk Management, Applications

Publish: 2026-03-12 13:14:12 UTC


#6 Effective Degrees of Freedom for Balanced Repeated Replication and Paired Jackknife Variance Estimates: A Unified Approach via Stratum Contrasts

Author: Matthias von Davier

Balanced repeated replication (BRR) and the jackknife are two widely used methods for estimating variances in stratified samples with two primary sampling units per stratum. While both methods produce variance estimators that can be expressed as sums of squared stratum-level contrasts, they differ fundamentally in their construction and in the dependence structure of their replicate estimates. This article examines the independence properties of the components contributing to these variance estimators. For BRR, we show that although the replicate estimates themselves are correlated, the balancing property of Hadamard matrices collapses the variance estimator into a sum of independent stratum-specific components. For the jackknife, the independence of components follows directly from the construction. Using these independence results, we derive the variance of each variance estimator and establish a direct connection to the Welch-Satterthwaite degrees of freedom approximation. This yields a practical formula for estimating degrees of freedom when constructing confidence intervals for population totals. The derivation highlights the unified treatment of both replication methods and provides insights into their relative efficiency and applicability.
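A rough sketch of the generic Welch-Satterthwaite idea for a variance estimator written as a sum of independent squared stratum contrasts, $v = \sum_h d_h^2$, each carrying one degree of freedom: the approximation is $\nu \approx (\sum_h d_h^2)^2 / \sum_h d_h^4$. This is the standard plug-in form, not necessarily the practical formula derived in the paper, and the function name `satterthwaite_df` is an assumption made here.

```python
def satterthwaite_df(contrasts):
    """Plug-in Welch-Satterthwaite degrees of freedom for v = sum(d_h ** 2),
    treating each squared contrast as an independent 1-df component."""
    squared = [d * d for d in contrasts]
    total = sum(squared)
    return total * total / sum(v * v for v in squared)
```

With ten strata of equal contrast size the function returns 10, while a single dominant stratum pulls the value towards 1, which is why unequal stratum variances reduce the effective degrees of freedom.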

Subjects: Methodology, Applications

Publish: 2026-03-12 09:10:51 UTC


#7 Dynamic Bayesian regression quantile synthesis for forecasting outlook-at-risk

Authors: Genya Kobayashi, Shonosuke Sugasawa, Yuta Yamauchi, Dongu Han

This paper proposes dynamic Bayesian regression quantile synthesis (DRQS), a novel method for quantile forecasting within the Bayesian predictive synthesis (BPS) framework designed to combine quantile-specific information from multiple agent models. While existing BPS approaches primarily focus on mean forecasting, our method directly targets the conditional quantiles of the response variable by utilizing the asymmetric Laplace distribution for the synthesis function. The resulting framework can be interpreted as a dynamic quantile linear model with latent predictors. We extend the univariate DRQS to a multivariate setting, factor DRQS (FDRQS), by introducing a time-varying latent factor structure for the synthesis weights. This allows the model to leverage cross-sectional dependencies and shared information across multiple time series simultaneously. We develop an efficient Markov chain Monte Carlo (MCMC) algorithm for posterior inference, utilizing data augmentation and forward-filtering backward-sampling. Empirical applications to US inflation and global GDP growth demonstrate the improved performance of the proposed methods for quantile forecasting. In particular, FDRQS exhibits superior resilience during periods of extreme economic stress, such as the COVID-19 pandemic, by adaptively rebalancing agent contributions and capturing emergent global dependencies.
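The asymmetric Laplace distribution targets a quantile because, for a fixed scale, maximizing its likelihood is equivalent to minimizing the check (pinball) loss $\rho_\tau(u) = u\,(\tau - \mathbb{1}\{u < 0\})$. The sketch below illustrates only that static fact on a constant-location problem; it is not the DRQS model, and the helper names `check_loss` and `best_constant` are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def best_constant(ys, tau):
    """Observed value that minimizes the total check loss; a minimizer of
    this loss over constants is always attained at a sample point."""
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))
```

On the values 1 through 9, `best_constant` picks the median 5 for `tau = 0.5` and the largest value 9 for `tau = 0.9`, matching the corresponding sample quantiles.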

Subjects: Methodology, Applications

Publish: 2026-03-12 02:53:53 UTC


#8 Multivariate Functional Principal Component Analysis for Mixed-Type mHealth Data: An Application to Mood Disorders

Authors: Debangan Dey, Rahul Ghosal, Kathleen Merikangas, Vadim Zipunnikov

Modern mobile health (mHealth) assessment combines self-reported measures of participants' health experiences with passively collected health behavior data throughout the day. These data are collected across multiple measurement scales, including continuous (physical activity), truncated (pain), ordinal (mood), and binary (daily life events). When indexed by time of day and stacked across assessment domains, these data structures can be treated as multivariate functional data comprising continuous, truncated, ordinal, and binary variables. Motivated by these applications, we propose a multivariate functional principal component analysis for mixed-type data ($M^2$FPCA). The approach is based on a semiparametric Gaussian copula model and assumes that the observed data arise from an underlying multivariate generalized latent nonparanormal functional process. Latent temporal and inter-variable dependence are estimated semiparametrically through Kendall's tau bridging method. Two covariance estimation procedures are developed: a fully multivariate block-wise estimator and a computationally efficient alternative based on partial separability that assumes shared principal components across domains. The proposed method yields interpretable latent functional principal component scores that can serve as participant-specific digital biomarkers. Simulation studies demonstrate the method's competitive performance under various complex dependence structures. The method is applied to mHealth data from 307 participants in the National Institute of Mental Health Family Study of Mood and Affective Spectrum Disorders. Our approach identifies time-of-day patterns shared across mood, anxiety, energy, and physical activity that meaningfully stratify mood disorder subtypes.
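For intuition about the Kendall's tau bridging step, the sketch below covers only the simplest case: two continuous variables under a Gaussian copula, where the latent correlation satisfies $\rho = \sin(\pi\tau/2)$. The bridging functions for truncated, ordinal, and binary pairs are different and are not reproduced here; the function names are assumptions, and this is not the authors' implementation.

```python
import math

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))

def bridge_continuous(tau):
    """Latent Gaussian-copula correlation for a continuous-continuous pair."""
    return math.sin(math.pi * tau / 2.0)
```

Because tau depends only on ranks, the estimate is unchanged by monotone transformations of either variable, which is what makes the rank-based route attractive for latent-correlation estimation.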

Subjects: Methodology, Applications

Publish: 2026-03-11 23:57:04 UTC


#9 Spatially Robust Inference with Predicted and Missing at Random Labels

Authors: Stephen Salerno, Zhenke Wu, Tyler McCormick

When outcome data are expensive or onerous to collect, scientists increasingly substitute predictions from machine learning and AI models for unlabeled cases, a process which has consequences for downstream statistical inference. While recent methods provide valid uncertainty quantification under independent sampling, real-world applications involve missing at random (MAR) labeling and spatial dependence. For inference in this setting, we propose a doubly robust estimator with cross-fit nuisances. We show that cross-fitting induces fold-level correlation that distorts spatial variance estimators, producing unstable or overly conservative confidence intervals. To address this, we propose a jackknife spatial heteroscedasticity and autocorrelation consistent (HAC) variance correction that separates spatial dependence from fold-induced noise. Under standard identification and dependence conditions, the resulting intervals are asymptotically valid. Simulations and benchmark datasets show substantial improvement in finite-sample calibration, particularly under MAR labeling and clustered sampling.
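As a point of reference for the doubly robust idea, here is a minimal sketch of the classical augmented inverse-probability-weighted (AIPW) estimator of a mean, $\hat{\theta} = n^{-1}\sum_i \{ f_i + R_i (Y_i - f_i)/\pi_i \}$, where $f_i$ is a model prediction, $R_i$ indicates a labeled unit, and $\pi_i$ is the labeling probability. It assumes known propensities and omits cross-fitting and the spatial HAC variance correction, so it is not the proposed estimator; the function name `aipw_mean` is hypothetical.

```python
def aipw_mean(predictions, outcomes, propensities):
    """AIPW estimate of a mean; outcomes[i] is None when unit i is unlabeled."""
    n = len(predictions)
    total = 0.0
    for f, y, p in zip(predictions, outcomes, propensities):
        total += f                 # prediction term, available for every unit
        if y is not None:
            total += (y - f) / p   # inverse-probability-weighted residual
    return total / n
```

When the predictions are exactly right the residual terms vanish and the estimate reduces to the mean prediction; when they are biased, the weighted residuals from the labeled units correct for it.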

Subjects: Machine Learning, Machine Learning, Econometrics, Applications, Methodology

Publish: 2026-03-11 23:14:21 UTC


#10 Teleodynamic Learning a new Paradigm For Interpretable AI

Authors: Enrique ter Horst, Juan Diego Zambrano

We introduce Teleodynamic Learning, a new paradigm for machine learning in which learning is not the minimization of a fixed objective, but the emergence and stabilization of functional organization under constraint. Inspired by living systems, this framework treats intelligence as the coupled evolution of three quantities: what a system can represent, how it adapts its parameters, and which changes its internal resources can sustain. We formalize learning as a constrained dynamical process with two interacting timescales: inner dynamics for continuous parameter adaptation and outer dynamics for discrete structural change, linked by an endogenous resource variable that both shapes and is shaped by the trajectory. This perspective reveals three phenomena that standard optimization does not naturally capture: self-stabilization without externally imposed stopping rules, phase-structured learning dynamics that move from under-structuring through teleodynamic growth to over-structuring, and convergence guarantees grounded in information geometry rather than convexity. We instantiate the framework in the Distinction Engine (DE11), a teleodynamic learner grounded in Spencer-Brown's Laws of Form, information geometry, and tropical optimization. On standard benchmarks, DE11 achieves 93.3 percent test accuracy on IRIS, 92.6 percent on WINE, and 94.7 percent on Breast Cancer, while producing interpretable logical rules that arise endogenously from the learning dynamics rather than being imposed by hand. More broadly, Teleodynamic Learning unifies regularization, architecture search, and resource-bounded inference within a single principle: learning as the co-evolution of structure, parameters, and resources under constraint. This opens a thermodynamically grounded route to adaptive, interpretable, and self-organizing AI.

Subjects: Machine Learning, Applications

Publish: 2026-03-11 22:43:10 UTC


#11 Two Point Correlation Function Estimation with Contaminated Data

Author: Arya Farahi

The two-point correlation function (2PCF) is a cornerstone of precision cosmology, yet its estimation from imaging surveys is vulnerable to contamination and incompleteness arising from imperfect target selection and pipeline-level inclusion decisions. In practice, the scientific target is a physically defined population, while the working catalog is constructed from noisy measurements and selection cuts, leading to mismatches between true and observed inclusion. These errors are often spatially structured, correlating with survey depth, observing conditions, and foregrounds, and can imprint spurious large-scale power or suppress the true clustering signal. High-resolution spectroscopic samples provide gold-standard inclusion in the target population but are typically available for only a small subset of objects. We introduce a prediction-powered Landy-Szalay (PP-LS) estimator that combines noisy inclusion labels across the full catalog with exact labels on a small spectroscopic subset while preserving the standard random-catalog normalization for survey geometry and selection. PP-LS debiases pair counts using residual-based, design-weighted corrections computed only on the labeled subset, requiring no probability calibration, known misclassification rates, or explicit modeling of contamination. Under simple random sampling of the labeled subset, we establish recovery of the oracle (true-label) Landy-Szalay pair counts and thus consistency for the target 2PCF. In simulations with clustered and spatially structured contaminants, PP-LS removes the bias of naive catalog-level estimators while achieving substantially lower variance than spectroscopic-only clustering. The resulting estimator is statistically principled, computationally lightweight, and integrates directly with standard pair-counting pipelines, enabling robust clustering inference in next-generation surveys.
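For readers unfamiliar with the baseline, the standard Landy-Szalay estimator in a single separation bin is $\hat{\xi} = (DD - 2DR + RR)/RR$, with each pair count normalized by the number of possible pairs. The sketch below implements only this standard estimator from raw counts; the prediction-powered debiasing of the counts is the paper's contribution and is not reproduced. The function and argument names are assumptions made for illustration.

```python
def landy_szalay(dd, dr, rr, n_data, n_random):
    """Standard Landy-Szalay estimate for one separation bin.

    dd, rr: raw counts of unique data-data and random-random pairs in the bin.
    dr:     raw count of data-random cross pairs in the bin.
    """
    dd_norm = dd / (n_data * (n_data - 1) / 2)
    dr_norm = dr / (n_data * n_random)
    rr_norm = rr / (n_random * (n_random - 1) / 2)
    return (dd_norm - 2.0 * dr_norm + rr_norm) / rr_norm
```

If the data pairs fill the bin at the same normalized rate as the random pairs, the estimate is zero (no clustering); an excess of data-data pairs pushes it above zero.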

Subjects: Instrumentation and Methods for Astrophysics, Cosmology and Nongalactic Astrophysics, Applications

Publish: 2026-03-11 20:22:48 UTC


#12 FlowSN: Normalising Flows for Simulation-Based Inference under Realistic Selection Effects applied to Supernova Cosmology

Authors: Benjamin M. Boyd, Kaisey S. Mandel, Matthew Grayling, Ayan Mitra, Richard Kessler, Maximilian Autenrieth, Aaron Do, Madeleine Ginolin, Lisa Kelsey, Gautham Narayan, Matthew O'Callaghan, Nikhil Sarin, Stephen Thorp

We present FlowSN, a statistical framework using simulation-based inference with normalising flows to account for selection effects in observational astronomy. Failure to account for selection effects can lead to biased inference on global parameters. An example is Malmquist bias, where detection limits result in a sample skewed towards brighter objects. In Type Ia supernova (SN Ia) cosmology, these selection effects can systematically shift the inferred posterior distributions of cosmological parameters, necessitating the development of robust statistical frameworks to account for the biases. Simulation-based inference enables us to implicitly learn probability distributions that are analytically intractable to calculate. In this work, we introduce a novel approach that employs a normalising flow to learn the non-analytic selected SN likelihood for a given survey from forward simulations, independent of the assumed cosmological model. The resulting likelihood approximation is incorporated into a hierarchical Bayesian framework and posterior sampling is performed using Hamiltonian Monte Carlo to obtain constraints on cosmological parameters conditioned on the observed data. The modular learnt likelihood approximation can be reused without retraining to evaluate different cosmological models, providing a key advantage over other simulation-based inference approaches. We demonstrate the performance of this methodology by training and testing the simulation-based inference technique using realistic LSST-like SNANA simulations for the first time. Our FlowSN approach yields accurate posterior estimates on cosmological parameters, including the dark energy equation of state $w_0$, that are an order of magnitude less biased than those obtained with conventional techniques and also exhibit improved frequentist calibration.
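The Malmquist bias mentioned in this abstract is easy to reproduce in a toy setting. The sketch below draws apparent magnitudes from a normal distribution and keeps only objects brighter than a detection limit (smaller magnitude means brighter). The population parameters, the hard magnitude cut, and the function name `malmquist_demo` are illustrative assumptions, far simpler than the survey simulations used in the paper.

```python
import random

def malmquist_demo(limit=24.0, n=20000, seed=1):
    """Return (population mean, detected-sample mean) of apparent magnitude."""
    rng = random.Random(seed)
    mags = [rng.gauss(24.0, 1.0) for _ in range(n)]  # toy population (assumption)
    detected = [m for m in mags if m < limit]        # hard detection cut
    return sum(mags) / n, sum(detected) / len(detected)
```

The detected-sample mean comes out noticeably smaller (brighter) than the population mean, so any inference that treats the detected objects as representative inherits a systematic shift unless the selection is modelled.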

Subjects: Cosmology and Nongalactic Astrophysics, Applications

Publish: 2026-03-11 18:00:01 UTC