Statistics

2025-04-03 | Total: 42

#1 Comparison of Bayesian methods for extrapolation of treatment effects: a large scale simulation study

Authors: Tristan Fauvel, Julien Tanniou, Pascal Godbillot, Billy Amzal

Extrapolating treatment effects from related studies is a promising strategy for designing and analyzing clinical trials in situations where achieving an adequate sample size is challenging. Bayesian methods are well-suited for this purpose, as they enable the synthesis of prior information through the use of prior distributions. While the operating characteristics of Bayesian approaches for borrowing data from control arms have been extensively studied, methods that borrow treatment effects (quantities derived from the comparison between two arms) remain less well understood. In this paper, we present the findings of an extensive simulation study designed to address this gap. We evaluate the frequentist operating characteristics of these methods, including the probability of success, mean squared error, bias, precision, and credible interval coverage. Our results provide insights into the strengths and limitations of existing methods in the context of confirmatory trials. In particular, we show that the Conditional Power Prior and the Robust Mixture Prior perform better overall, while the test-then-pool variants and the p-value-based power prior display suboptimal performance.

Subjects: Methodology, Applications

Publish: 2025-04-02 17:55:11 UTC


#2 Estimating hazard rates from δ-records in discrete distributions

Authors: Martín Alcalde, Miguel Lafuente, F. Javier López, Lina Maldonado, Gerardo Sanz

This paper focuses on nonparametric statistical inference of the hazard rate function of discrete distributions based on δ-record data. We derive the explicit expression of the maximum likelihood estimator and determine its exact distribution, as well as some important characteristics such as its bias and mean squared error. We then discuss the construction of confidence intervals and goodness-of-fit tests. The performance of our proposals is evaluated using simulation methods. Applications to real data are given, as well. The estimation of the hazard rate function based on usual records has been studied in the literature, although many procedures require several samples of records. In contrast, our approach relies on a single sequence of δ-records, simplifying the experimental design and increasing the applicability of the methods.

Subject: Statistics Theory

Publish: 2025-04-02 15:43:19 UTC


#3 A New Approach to the Nonparametric Behrens-Fisher Problem with Compatible Confidence Intervals

Authors: Stephen Schüürhuis, Frank Konietschke, Edgar Brunner

We propose a new test to address the nonparametric Behrens-Fisher problem involving different distribution functions in the two samples. Our procedure tests the null hypothesis $\mathcal{H}_0: \theta = \frac{1}{2}$, where $\theta = P(X<Y) + \frac{1}{2}P(X=Y)$ denotes the Mann-Whitney effect. No restrictions on the underlying distributions of the data are imposed with the trivial exception of one-point distributions. The method is based on evaluating the ratio of the variance $\sigma_N^2$ of the Mann-Whitney effect estimator $\widehat{\theta}$ to its theoretical maximum, as derived from the Birnbaum-Klose inequality. Through simulations, we demonstrate that the proposed test effectively controls the type-I error rate under various conditions, including small sample sizes, unbalanced designs, and different data-generating mechanisms. Notably, it provides better control of the type-I error rate compared to the widely used Brunner-Munzel test, particularly at small significance levels such as $\alpha \in \{0.01, 0.005\}$. Additionally, we derive range-preserving compatible confidence intervals, showing that they offer improved coverage over those compatible with the Brunner-Munzel test. Finally, we illustrate the application of our method in a clinical trial example.
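As a reading aid for the definition of the Mann-Whitney effect above, the following is a minimal sketch of its usual plug-in estimate: the fraction of pairs with X_i < Y_j, counting ties as one half. It is not the test proposed in the paper and does not estimate the variance; the function name `mann_whitney_effect` is hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mann_whitney_effect(x: Sequence[float], y: Sequence[float]) -> float:
    """Plug-in estimate of theta = P(X < Y) + 0.5 * P(X = Y).

    Illustrative helper (hypothetical name); averages over all (x_i, y_j) pairs.
    """
    if len(x) == 0 or len(y) == 0:
        raise ValueError("both samples must be non-empty")
    total = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                total += 1.0   # pair counts fully toward P(X < Y)
            elif xi == yj:
                total += 0.5   # ties contribute one half
    return total / (len(x) * len(y))


if __name__ == "__main__":
    # Two small samples with ties between them.
    print(mann_whitney_effect([1, 2, 3], [2, 3, 4]))
```

A value of 0.5 corresponds to the null hypothesis stated in the abstract; values above 0.5 indicate that observations from the second sample tend to be larger.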

Subject: Methodology

Publish: 2025-04-02 15:03:01 UTC


#4 Proper scoring rules for estimation and forecast evaluation

Authors: Kartik Waghmare, Johanna Ziegel

Proper scoring rules have been a subject of growing interest in recent years, not only as tools for evaluation of probabilistic forecasts but also as methods for estimating probability distributions. In this article, we review the mathematical foundations of proper scoring rules including general characterization results and important families of scoring rules. We discuss their role in statistics and machine learning for estimation and forecast evaluation. Furthermore, we comment on interesting developments of their usage in applications.
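To make the notion of propriety concrete, here is a minimal sketch using the Brier score for a binary event, a standard textbook example rather than anything taken from the article. With the score oriented so that smaller is better, a scoring rule is proper when the expected score is minimized by forecasting the true probability. The helper names `expected_brier` and `best_forecast` are hypothetical.

```python
def expected_brier(p: float, q: float) -> float:
    """Expected Brier score of forecast q when the event has true probability p."""
    # With probability p the event occurs (loss (1 - q)^2), otherwise loss q^2.
    return p * (1.0 - q) ** 2 + (1.0 - p) * q ** 2


def best_forecast(p: float, grid_size: int = 100) -> float:
    """Grid point in [0, 1] that minimizes the expected Brier score."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return min(grid, key=lambda q: expected_brier(p, q))


if __name__ == "__main__":
    # The minimizing forecast coincides with the true probability.
    print(best_forecast(0.3))
```

The expected score equals (q - p)^2 + p(1 - p), so the minimizer is q = p, which is what the grid search recovers.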

Subjects: Statistics Theory, Machine Learning

Publish: 2025-04-02 14:46:14 UTC


#5 Non-parametric Quantile Regression and Uniform Inference with Unknown Error Distribution

Authors: Haoze Hou, Wei Huang, Zheng Zhang

This paper studies the non-parametric estimation and uniform inference for the conditional quantile regression function (CQRF) with covariates exposed to measurement errors. We consider the case that the distribution of the measurement error is unknown and allowed to be either ordinary or super smooth. We estimate the density of the measurement error by the repeated measurements and propose the deconvolution kernel estimator for the CQRF. We derive the uniform Bahadur representation of the proposed estimator and construct the uniform confidence bands for the CQRF, uniformly in the sense for all covariates and a set of quantile indices, and establish the theoretical validity of the proposed inference. A data-driven approach for selecting the tuning parameter is also included. Monte Carlo simulations and a real data application demonstrate the usefulness of the proposed method.

Subjects: Methodology, Econometrics

Publish: 2025-04-02 14:16:39 UTC


#6 KD$^{2}$M: An unifying framework for feature knowledge distillation

Author: Eduardo Fernandes Montesuma

Knowledge Distillation (KD) seeks to transfer the knowledge of a teacher to a student neural network. This is often done by matching the networks' predictions (i.e., their outputs), but recently, several works have proposed matching the distributions of the networks' activations (i.e., their features), a process known as distribution matching. In this paper, we propose a unifying framework, Knowledge Distillation through Distribution Matching (KD$^{2}$M), which formalizes this strategy. Our contributions are threefold. We i) provide an overview of distribution metrics used in distribution matching, ii) benchmark on computer vision datasets, and iii) derive new theoretical results for KD.

Subject: Machine Learning

Publish: 2025-04-02 14:14:46 UTC


#7 Segmentation variability and radiomics stability for predicting Triple-Negative Breast Cancer subtype using Magnetic Resonance Imaging

Authors: Isabella Cama, Alejandro Guzmán, Cristina Campi, Michele Piana, Karim Lekadir, Sara Garbarino, Oliver Díaz

Most papers caution against using predictive models for disease stratification based on unselected radiomic features, as these features are affected by contouring variability. Instead, they advocate for the use of the Intraclass Correlation Coefficient (ICC) as a measure of stability for feature selection. However, the direct effect of segmentation variability on the predictive models is rarely studied. This study investigates the impact of segmentation variability on feature stability and predictive performance in radiomics-based prediction of Triple-Negative Breast Cancer (TNBC) subtype using Magnetic Resonance Imaging. A total of 244 images from the Duke dataset were used, with segmentation variability introduced through modifications of manual segmentations. For each mask, explainable radiomic features were selected using the Shapley Additive exPlanations method and used to train logistic regression models. Feature stability across segmentations was assessed via ICC, Pearson's correlation, and reliability scores quantifying the relationship between feature stability and segmentation variability. Results indicate that segmentation accuracy does not significantly impact predictive performance. While incorporating peritumoral information may reduce feature reproducibility, it does not diminish feature predictive capability. Moreover, feature selection in predictive models is not inherently tied to feature stability with respect to segmentation, suggesting that an overreliance on ICC or reliability scores for feature selection might exclude valuable predictive features.

Subjects: Applications, Artificial Intelligence

Publish: 2025-04-02 12:48:01 UTC


#8 Sparse Gaussian Neural Processes

Authors: Tommy Rochussen, Vincent Fortuin

Despite significant recent advances in probabilistic meta-learning, it is common for practitioners to avoid using deep learning models due to a comparative lack of interpretability. Instead, many practitioners simply use non-meta-models such as Gaussian processes with interpretable priors, and conduct the tedious procedure of training their model from scratch for each task they encounter. While this is justifiable for tasks with a limited number of data points, the cubic computational cost of exact Gaussian process inference renders this prohibitive when each task has many observations. To remedy this, we introduce a family of models that meta-learn sparse Gaussian process inference. Not only does this enable rapid prediction on new tasks with sparse Gaussian processes, but since our models have clear interpretations as members of the neural process family, it also allows manual elicitation of priors in a neural process for the first time. In meta-learning regimes for which the number of observed tasks is small or for which expert domain knowledge is available, this offers a crucial advantage.

Subject: Machine Learning

Publish: 2025-04-02 12:00:09 UTC


#9 Density estimation via mixture discrepancy and moments

Authors: Zhengyang Lei, Sihong Shao

With the aim of generalizing histogram statistics to higher dimensional cases, density estimation via discrepancy based sequential partition (DSP) has been proposed [D. Li, K. Yang, W. Wong, Advances in Neural Information Processing Systems (2016) 1099-1107] to learn an adaptive piecewise constant approximation defined on a binary sequential partition of the underlying domain, where the star discrepancy is adopted to measure the uniformity of particle distribution. However, the calculation of the star discrepancy is NP-hard, and the star discrepancy satisfies neither reflection invariance nor rotation invariance. To address this, we use the mixture discrepancy and the comparison of moments as replacements for the star discrepancy, leading to density estimation via mixture discrepancy based sequential partition (DSP-mix) and density estimation via moments based sequential partition (MSP), respectively. Both DSP-mix and MSP are computationally tractable and exhibit reflection and rotation invariance. Numerical experiments in reconstructing $d$-dimensional mixtures of Gaussians and Betas with $d = 2, 3, \dots, 6$ demonstrate that DSP-mix and MSP both run approximately ten times faster than DSP while maintaining the same accuracy.

Subjects: Machine Learning, Computational Physics, Methodology

Publish: 2025-04-02 10:15:03 UTC


#10 Asymptotic analysis of the finite predictor for the fractional Gaussian noise

Authors: P. Chigansky, M. Kleptsyna

The goal of this paper is to propose a new approach to asymptotic analysis of the finite predictor for stationary sequences. It produces the exact asymptotics of the relative prediction error and the partial correlation coefficients. The assumptions are analytic in nature and applicable to processes with long range dependence. The ARIMA type process driven by the fractional Gaussian noise (fGn), which previously remained elusive, serves as our study case.

Subjects: Statistics Theory, Probability

Publish: 2025-04-02 10:03:53 UTC


#11 On Robust Empirical Likelihood for Nonparametric Regression with Application to Regression Discontinuity Designs

Authors: Qin Fang, Shaojun Guo, Yang Hong, Xinghao Qiao

Empirical likelihood serves as a powerful tool for constructing confidence intervals in nonparametric regression and regression discontinuity designs (RDD). The original empirical likelihood framework can be naturally extended to these settings using local linear smoothers, with Wilks' theorem holding only when an undersmoothed bandwidth is selected. However, the generalization of bias-corrected versions of empirical likelihood under more realistic conditions is non-trivial and has remained an open challenge in the literature. This paper provides a satisfactory solution by proposing a novel approach, referred to as robust empirical likelihood, designed for nonparametric regression and RDD. The core idea is to construct robust weights which simultaneously achieve bias correction and account for the additional variability introduced by the estimated bias, thereby enabling valid confidence interval construction without extra estimation steps involved. We demonstrate that the Wilks' phenomenon still holds under weaker conditions in nonparametric regression, sharp and fuzzy RDD settings. Extensive simulation studies confirm the effectiveness of our proposed approach, showing superior performance over existing methods in terms of coverage probabilities and interval lengths. Moreover, the proposed procedure exhibits robustness to bandwidth selection, making it a flexible and reliable tool for empirical analyses. The practical usefulness is further illustrated through applications to two real datasets.

Subjects: Statistics Theory, Econometrics, Methodology

Publish: 2025-04-02 09:22:18 UTC


#12 Predicting passenger injury distributions under uncertainty variables using Gaussian process modeling with GHBMC

Authors: Changmin Baek, Junik Cho, Dongjin Lee

This work presents a Gaussian Process (GP) modeling method to predict statistical characteristics of injury kinematics responses using Human Body Models (HBM) more accurately and efficiently. We validate the GHBMC model against a 50th-percentile male Post-Mortem Human Surrogate (PMHS) test. Using this validated model, we create various postured models and generate injury prediction data across different postures and personalized D-ring heights through parametric crash simulations. We then train the GP using this simulation data, implementing a novel adaptive sampling approach to improve accuracy. The trained GP model demonstrates robustness by achieving target prediction accuracy at points with high uncertainty. The proposed method performs continuous injury prediction for various crash scenarios using just 27 computationally expensive simulation runs. This method can be effectively applied to designing highly reliable occupant restraint systems across diverse crash conditions.

Subject: Applications

Publish: 2025-04-02 09:17:39 UTC


#13 On the limitations for causal inference in Cox models with time-varying treatment

Authors: Mark B. Knudsen, Erin E. Gabriel, Torben Martinussen, Helene C. W. Rytgaard, Arvid Sjölander

When using the Cox model to analyze the effect of a time-varying treatment on a survival outcome, treatment is commonly included, using only the current level as a time-dependent covariate. Such a model does not necessarily assume that past treatment is not associated with the outcome (the Markov property), since it is possible to model the hazard conditional on only the current treatment value. However, modeling the hazard conditional on the full treatment history is required in order to interpret the results causally, and such a full model assumes the Markov property when only including current treatment. This is, for example, common in marginal structural Cox models. We demonstrate that relying on the Markov property is problematic, since it only holds in unrealistic settings or if the treatment has no causal effect. This is the case even if there are no confounders and the true causal effect of treatment really only depends on its current level. Further, we provide an example of a scenario where the Markov property is not fulfilled, but the Cox model that includes only current treatment as a covariate is correctly specified. Transforming the result to the survival scale does not give the true intervention-specific survival probabilities, showcasing that it is unclear how to make causal statements from such models.

Subject: Methodology

Publish: 2025-04-02 09:10:48 UTC


#14 Time-to-event prediction for grouped variables using Exclusive Lasso

Authors: Dayasri Ravi, Andreas Groll

The integration of high-dimensional genomic data and clinical data into time-to-event prediction models has gained significant attention due to the growing availability of these datasets. Traditionally, a Cox regression model is employed, concatenating various covariate types linearly. Given that much of the data may be redundant or irrelevant, feature selection through penalization is often desirable. A notable characteristic of these datasets is their organization into blocks of distinct data types, such as methylation and clinical predictors, which requires selecting a subset of covariates from each group due to high intra-group correlations. For this reason, we propose utilizing Exclusive Lasso regularization in place of standard Lasso penalization. We apply our methodology to a real-life cancer dataset, demonstrating enhanced survival prediction performance compared to the conventional Cox regression model.

Subjects: Methodology, Computation, Machine Learning

Publish: 2025-04-02 09:07:05 UTC


#15 Adaptive adequacy testing of high-dimensional factor-augmented regression model

Authors: Yanmei Shi, Leheng Cai, Xu Guo, Shu rong Zheng

In this paper, we investigate the adequacy testing problem for high-dimensional factor-augmented regression models. Existing test procedures do not perform well under dense alternatives. To address this critical issue, we introduce a novel quadratic-type test statistic which can efficiently detect dense alternative hypotheses. We further propose an adaptive test procedure that remains powerful under both sparse and dense alternative hypotheses. Theoretically, under the null hypothesis, we establish the asymptotic normality of the proposed quadratic-type test statistic and the asymptotic independence of this statistic and a maximum-type test statistic. We also prove that our adaptive test procedure is powerful in detecting signals under either sparse or dense alternative hypotheses. Simulation studies and an application to a FRED-MD macroeconomics dataset are carried out to illustrate the merits of the proposed procedures.

Subject: Methodology

Publish: 2025-04-02 07:34:23 UTC


#16 Evaluating probabilities without model risk

Authors: Joan del Castillo, Pedro Puig

This article presents methods for estimating extreme probabilities, beyond the range of the observations. These methods are model-free and applicable to almost any sample size. They are grounded in order statistics theory and have a wide range of applications, as they simply require the assumption of a finite expectation. Even in cases when a particular risk model exists, the new methods provide clarity, security and simplicity. The methodology is applicable to the behavior of financial markets, and the results may be compared to those provided by extreme value theory.

Subject: Applications

Publish: 2025-04-02 06:10:57 UTC


#17 A Practical Guide to Estimating Conditional Marginal Effects: Modern Approaches

Authors: Jiehan Liu, Ziyi Liu, Yiqing Xu

This Element offers a practical guide to estimating conditional marginal effects (how treatment effects vary with a moderating variable) using modern statistical methods. Commonly used approaches, such as linear interaction models, often suffer from unclarified estimands, limited overlap, and restrictive functional forms. This guide begins by clearly defining the estimand and presenting the main identification results. It then reviews and improves upon existing solutions, such as the semiparametric kernel estimator, and introduces robust estimation strategies, including augmented inverse propensity score weighting with Lasso selection (AIPW-Lasso) and double machine learning (DML) with modern algorithms. Each method is evaluated through simulations and empirical examples, with practical recommendations tailored to sample size and research context. All tools are implemented in the accompanying interflex package for R.

Subjects: Methodology, Econometrics

Publish: 2025-04-02 05:00:14 UTC


#18 Tail Bounds for Canonical U-Statistics and U-Processes with Unbounded Kernels

Authors: Abhishek Chakrabortty, Arun K. Kuchibhotla

In this paper, we prove exponential tail bounds for canonical (or degenerate) U-statistics and U-processes under exponential-type tail assumptions on the kernels. Most existing results in the relevant literature either assume bounded kernels or obtain sub-optimal tail behavior under unbounded kernels. We obtain sharp rates and optimal tail behavior under sub-Weibull kernel functions. Some examples from the nonparametric and semiparametric statistics literature are considered.

Subjects: Statistics Theory, Probability

Publish: 2025-04-02 03:05:28 UTC


#19 Analyzing Functional Data with a Mixture of Covariance Structures Using a Curved-Based Sampling Scheme

Authors: Yian Yu, Bo Wang, Jian Qing Shi

Motivated by distinct walking patterns in real-world free-living gait data, this paper proposes an innovative curve-based sampling scheme for the analysis of functional data characterized by a mixture of covariance structures. Traditional approaches often fail to adequately capture inherent complexities arising from heterogeneous covariance patterns across distinct subsets of the data. We introduce a unified Bayesian framework that integrates a nonlinear regression function with a continuous-time hidden Markov model, enabling the identification and utilization of varying covariance structures. One of the key contributions is the development of a computationally efficient curve-based sampling scheme for hidden state estimation, addressing the sampling complexities associated with high-dimensional, conditionally dependent data. This paper details the Bayesian inference procedure, examines the asymptotic properties to ensure the structural consistency of the model, and demonstrates its effectiveness through simulated and real-world examples.

Subject: Methodology

Publish: 2025-04-02 02:55:35 UTC


#20 SELIC: Semantic-Enhanced Learned Image Compression via High-Level Textual Guidance

Authors: Haisheng Fu, Jie Liang, Zhenman Fang, Jingning Han

Learned image compression (LIC) techniques have achieved remarkable progress; however, effectively integrating high-level semantic information remains challenging. In this work, we present a Semantic-Enhanced Learned Image Compression framework, termed SELIC, which leverages high-level textual guidance to improve rate-distortion performance. Specifically, SELIC employs a text encoder to extract rich semantic descriptions from the input image. These textual features are transformed into fixed-dimension tensors and seamlessly fused with the image-derived latent representation. By embedding the SELIC tensor directly into the compression pipeline, our approach enriches the bitstream without requiring additional inputs at the decoder, thereby maintaining fast and efficient decoding. Extensive experiments on benchmark datasets (e.g., Kodak) demonstrate that integrating semantic information substantially enhances compression quality. Our SELIC-guided method outperforms a baseline LIC model without semantic integration by approximately 0.1-0.15 dB across a wide range of bit rates in PSNR and achieves a 4.9% BD-rate improvement over VVC. Moreover, this improvement comes with minimal computational overhead, making the proposed SELIC framework a practical solution for advanced image compression applications.

Subject: Applications

Publish: 2025-04-02 01:07:34 UTC


#21 Online Fault Detection and Classification of Chemical Process Systems Leveraging Statistical Process Control and Riemannian Geometric Analysis

Authors: Alireza Miraliakbar, Fangyuan Ma, Zheyu Jiang

In this work, we study an integrated fault detection and classification framework called FARM for fast, accurate, and robust online chemical process monitoring. The FARM framework integrates the latest advancements in statistical process control (SPC) for monitoring nonparametric and heterogeneous data streams with novel data analysis approaches based on Riemannian geometry together in a hierarchical framework for online process monitoring. We conduct a systematic evaluation of the FARM monitoring framework using the Tennessee Eastman Process (TEP) dataset. Results show that FARM performs competitively against state-of-the-art process monitoring algorithms by achieving a good balance among fault detection rate (FDR), fault detection speed (FDS), and false alarm rate (FAR). Specifically, FARM achieved an average FDR of 96.97% while also outperforming benchmark methods in successfully detecting hard-to-detect faults that are previously known, including Faults 3, 9 and 15, with FDRs being 97.08%, 96.30% and 95.99%, respectively. In terms of FAR, our FARM framework allows practitioners to customize their choice of FAR, thereby offering great flexibility. Moreover, we report a significant improvement in average fault classification accuracy during online monitoring from 61% to 82% when leveraging Riemannian geometric analysis, and further to 84.5% when incorporating additional features from SPC. This illustrates the synergistic effect of integrating fault detection and classification in a holistic, hierarchical monitoring framework.

Subject: Other Statistics

Publish: 2025-04-02 01:00:36 UTC


#22 On spectral gap decomposition for Markov chains

Author: Qian Qin

Multiple works regarding convergence analysis of Markov chains have led to spectral gap decomposition formulas of the form $\mathrm{Gap}(S) \geq c_0 \left[\inf_z \mathrm{Gap}(Q_z)\right] \mathrm{Gap}(\bar{S})$, where $c_0$ is a constant, $\mathrm{Gap}$ denotes the right spectral gap of a reversible Markov operator, $S$ is the Markov transition kernel (Mtk) of interest, $\bar{S}$ is an idealized or simplified version of $S$, and $\{Q_z\}$ is a collection of Mtks characterizing the differences between $S$ and $\bar{S}$. This type of relationship has been established in various contexts, including: 1. decomposition of Markov chains based on a finite cover of the state space, 2. hybrid Gibbs samplers, and 3. spectral independence and localization schemes. We show that multiple key decomposition results across these domains can be connected within a unified framework, rooted in a simple sandwich structure of $S$. Within the general framework, we establish new instances of spectral gap decomposition for hybrid hit-and-run samplers and hybrid data augmentation algorithms with two intractable conditional distributions. Additionally, we explore several other properties of the sandwich structure, and derive extensions of the spectral gap decomposition formula.

Subjects: Statistics Theory, Probability

Publish: 2025-04-01 23:23:34 UTC


#23 Stock Return Prediction based on a Functional Capital Asset Pricing Model

Authors: Ufuk Beyaztas, Kaiying Ji, Han Lin Shang, Eliza Wu

The capital asset pricing model (CAPM) is readily used to capture a linear relationship between the daily returns of an asset and a market index. We extend this model to an intraday high-frequency setting by proposing a functional CAPM estimation approach. The functional CAPM is a stylized example of a function-on-function linear regression with a bivariate functional regression coefficient. The two-dimensional regression coefficient measures the cross-covariance between cumulative intraday asset returns and market returns. We apply it to the Standard and Poor's 500 index and its constituent stocks to demonstrate its practicality. We investigate the functional CAPM's in-sample goodness-of-fit and out-of-sample prediction for an asset's cumulative intraday return. The findings suggest that the proposed functional CAPM methods have superior model goodness-of-fit and forecast accuracy compared to the traditional CAPM empirical estimation. In particular, the functional methods produce better model goodness-of-fit and prediction accuracy for stocks traditionally considered less price-efficient or more information-opaque.

Subjects: Methodology, Applications

Publish: 2025-04-01 23:04:50 UTC


#24 Bivariate Simplex Distribution

Authors: Emerson Amaral, Lucas S. Vieira, Lizandra C. Fabio, Vanessa Barros, Jalmar M. F. Carrasco

This article proposes a bivariate Simplex distribution for modeling continuous outcomes constrained to the interval (0,1), which can represent proportions, rates, or indices. We derive analytical expressions to calculate the dependence between the variables and examine its relationship with the association parameter. Parameters are estimated using the maximum likelihood method, and their performance is assessed through Monte Carlo simulations. The simulations explore various aspects of the bivariate distribution, including different surfaces and contour graphs. To illustrate the proposed model's methodology and properties, we present an application in the Jurimetric area. A user-friendly package, BSimplex, is also available in the R software.

Subject: Methodology

Publish: 2025-04-01 21:48:33 UTC


#25 Conformal Anomaly Detection for Functional Data with Elastic Distance Metrics

Authors: Jason Adams, Brandon Berman, Joshua Michalenko, J. Derek Tucker

This paper considers the problem of outlier detection in functional data analysis focusing particularly on the more difficult case of shape outliers. We present an inductive conformal anomaly detection method based on elastic functional distance metrics. This method is evaluated and compared to similar conformal anomaly detection methods for functional data using simulation experiments. The method is also used in the analysis of two real exemplar data sets that show its utility in practical applications. The results demonstrate the efficacy of the proposed method for detecting both magnitude and shape outliers in two distinct outlier detection scenarios.

Subject: Methodology

Publish: 2025-04-01 20:28:29 UTC