Methodology

2024-10-22 | Total: 34

#1 Dynamic Time Warping-based imputation of long gaps in human mobility trajectories

Authors: Danielle McCool ; Peter Lugtig ; Barry Schouten

Individual mobility trajectories are difficult to measure and often incur long periods of missingness. Aggregation of this mobility data without accounting for the missingness leads to erroneous results, underestimating travel behavior. This paper proposes Dynamic Time Warping-Based Multiple Imputation (DTWBMI) as a method of filling long gaps in human mobility trajectories in order to use the available data to the fullest extent. This method reduces spatiotemporal trajectories to time series of particular travel behavior, then selects candidates for multiple imputation on the basis of the dynamic time warping distance between the potential donor series and the series preceding and following the gap in the recipient series and finally imputes values multiple times. A simulation study designed to establish optimal parameters for DTWBMI provides two versions of the method. These two methods are applied to a real-world dataset of individual mobility trajectories with simulated missingness and compared against other methods of handling missingness. Linear interpolation outperforms DTWBMI and other methods when gaps are short and data are limited. DTWBMI outperforms other methods when gaps become longer and when more data are available.

Subject: Methodology

Publish: 2024-10-21 15:21:51 UTC
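
The abstract of entry #1 selects imputation donors by dynamic time warping distance. A minimal sketch of that distance follows, using the textbook recursion with an absolute-difference cost. The helper `rank_donors` and its inputs are hypothetical illustrations; the paper's actual candidate selection (series before and after the gap) and its multiple imputation step are not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with |x - y| as local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def rank_donors(recipient_context, donors):
    """Hypothetical helper: donor indices ordered from closest to farthest."""
    distances = [dtw_distance(recipient_context, d) for d in donors]
    return sorted(range(len(donors)), key=lambda k: distances[k])
```

Because warping lets one series repeat a value, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the two series differ in length.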

#2 Improving the (approximate) sequential probability ratio test by avoiding overshoot

Authors: Lasse Fischer ; Aaditya Ramdas

The sequential probability ratio test (SPRT) by Wald (1945) is a cornerstone of sequential analysis. Based on desired type I and type II error levels $\alpha, \beta \in (0,1)$, it stops when the likelihood ratio statistic crosses certain upper and lower thresholds, guaranteeing optimality of the expected sample size. However, these thresholds are not available in closed form and the test is often applied with approximate thresholds $(1-\beta)/\alpha$ and $\beta/(1-\alpha)$ (approximate SPRT). When $\beta > 0$, this neither guarantees type I and type II error control at $\alpha,\beta$ nor optimality. When $\beta=0$ (power-one SPRT), it guarantees type I error control at $\alpha$ that is in general conservative, and thus not optimal. The looseness in both cases is caused by overshoot: the test statistic overshoots the thresholds at the stopping time. One standard way to address this is to calculate the right thresholds numerically, but many papers and software packages do not do this. In this paper, we describe a different way to improve the approximate SPRT: we change the test statistic to avoid overshoot. Our technique uniformly improves power-one SPRTs $(\beta=0)$ for simple nulls and alternatives, or for one-sided nulls and alternatives in exponential families. When $\beta > 0$, our techniques provide valid type I error guarantees, lead to similar type II error as Wald's, but often need fewer samples. These improved sequential tests can also be used for deriving tighter parametric confidence sequences, and can be extended to nontrivial settings like sampling without replacement and conformal martingales.

Subject: Methodology

Publish: 2024-10-21 14:54:49 UTC
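
To make the overshoot discussed in entry #2 concrete, here is a minimal sketch of the approximate SPRT with thresholds $(1-\beta)/\alpha$ and $\beta/(1-\alpha)$, worked on the log scale. It assumes Bernoulli observations with a simple null `p0` and a simple alternative `p1`; the function name and return format are illustrative choices. It shows the baseline test only, not the authors' improved statistic.

```python
import math


def approximate_sprt(observations, p0, p1, alpha, beta):
    """Approximate SPRT for Bernoulli data (illustrative sketch).

    Returns (decision, samples_used, overshoot), where overshoot is how far
    the log likelihood ratio went past the threshold it crossed.
    """
    upper = math.log((1 - beta) / alpha)
    # beta = 0 gives the power-one SPRT, which never stops to accept the null.
    lower = math.log(beta / (1 - alpha)) if beta > 0 else -math.inf
    log_lr = 0.0
    for n, x in enumerate(observations, start=1):
        if x == 1:
            log_lr += math.log(p1 / p0)
        else:
            log_lr += math.log((1 - p1) / (1 - p0))
        if log_lr >= upper:
            return "reject_null", n, log_lr - upper
        if log_lr <= lower:
            return "accept_null", n, lower - log_lr
    return "continue", len(observations), 0.0
```

Since the log likelihood ratio moves in discrete jumps, it generally lands strictly beyond the threshold at the stopping time; the third return value measures that gap.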

#3 A Causal Transformation Model for Time-to-Event Data Affected by Unobserved Confounding: Revisiting the Illinois Reemployment Bonus Experiment

Authors: Giampiero Marra ; Rosalba Radice

Motivated by studies investigating causal effects in survival analysis, we propose a transformation model to quantify the impact of a binary treatment on a time-to-event outcome. The approach is based on a flexible linear transformation structural model that links a monotone function of the time-to-event with the propensity for treatment through a bivariate Gaussian distribution. The model equations are specified as functions of additive predictors, allowing the impacts of observed confounders to be accounted for flexibly. Furthermore, the effect of the instrumental variable may be regularized through a ridge penalty, while interactions between the treatment and modifier variables can be incorporated into the model to acknowledge potential variations in treatment effects across different subgroups. The baseline survival function is estimated in a flexible manner using monotonic P-splines, while unobserved confounding is captured through the dependence parameter of the bivariate Gaussian. Parameter estimation is achieved via a computationally efficient and stable penalized maximum likelihood estimation approach, and intervals are constructed using the related inferential results. We revisit a dataset from the Illinois Reemployment Bonus Experiment to estimate the causal effect of a cash bonus on unemployment duration, unveiling new insights. The modeling framework is incorporated into the R package GJRM, enabling researchers and practitioners to fit the proposed causal survival model and obtain easy-to-interpret numerical and visual summaries.

Subject: Methodology

Publish: 2024-10-21 12:54:57 UTC

#4 A measure of departure from symmetry via the Fisher-Rao distance for contingency tables

Authors: Wataru Urasaki ; Go Kawamitsu ; Tomoyuki Nakagawa ; Kouji Tahata

A measure of asymmetry is a quantification method that allows for the comparison of categorical evaluations before and after treatment effects or among different target populations, irrespective of sample size. We focus on square contingency tables that summarize survey results between two time points or cohorts, represented by the same categorical variables. We propose a measure to evaluate the degree of departure from a symmetry model using cosine similarity. This proposal is based on the Fisher-Rao distance, allowing asymmetry to be interpreted as a geodesic distance between two distributions. Various measures of asymmetry have been proposed, but visualizing the relationship of these quantification methods on a two-dimensional plane demonstrates that the proposed measure provides the geometrically simplest and most natural quantification. Moreover, the visualized figure indicates that the proposed method for measuring departures from symmetry is less affected by very few cells with extreme asymmetry. A simulation study shows that for square contingency tables with an underlying asymmetry model, our method can directly extract and quantify only the asymmetric structure of the model, and can more sensitively detect departures from symmetry than divergence-type measures.

Subject: Methodology

Publish: 2024-10-21 10:57:18 UTC
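
Entry #4 builds on the Fisher-Rao distance. For categorical distributions $p$ and $q$ it has the closed form $2\arccos\left(\sum_i \sqrt{p_i q_i}\right)$, where the sum is the cosine similarity between the square-root vectors $\sqrt{p}$ and $\sqrt{q}$. The sketch below computes only this standard distance. The function name is an assumption, and the paper's asymmetry measure for square contingency tables is not reproduced.

```python
import math


def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between two categorical distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of categories")
    # sqrt(p) and sqrt(q) are unit vectors, so this sum is their cosine similarity.
    cosine = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Guard against floating-point drift just outside [0, 1].
    cosine = min(1.0, max(0.0, cosine))
    return 2.0 * math.acos(cosine)
```

The distance is zero for identical distributions and reaches its maximum, $\pi$, when the two distributions have disjoint support.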

#5 Nonparametric method of structural break detection in stochastic time series regression model

Authors: Archi Roy ; Moumanti Podder ; Soudeep Deb

We propose a nonparametric algorithm to detect structural breaks in the conditional mean and/or variance of a time series. Our method does not assume any specific parametric form for the dependence structure of the regressor, the time series model, or the distribution of the model noise. This flexibility allows our algorithm to be applicable to a wide range of time series structures commonly encountered in financial econometrics. The effectiveness of the proposed algorithm is validated through an extensive simulation study and a real data application in detecting structural breaks in the mean and volatility of Bitcoin returns. The algorithm's ability to identify structural breaks in the data highlights its practical utility in econometric analysis and financial modeling.

Subject: Methodology

Publish: 2024-10-21 07:32:55 UTC

#6 Variable screening for covariate dependent extreme value index estimation

Authors: Takuma Yoshida ; Yuta Umezu

One of the main topics of extreme value analysis is to estimate the extreme value index, an important parameter that controls the tail behavior of the distribution. In many cases, estimating the extreme value index of the target variable associated with covariates is useful. Although the estimation of the covariate-dependent extreme value index has been developed by numerous researchers, no results have been presented regarding covariate selection. This paper proposes a sure independence screening method for covariate-dependent extreme value index estimation. For the screening, the marginal utility between the target variable and each covariate is calculated using the conditional Pickands estimator. A single-index model that uses the covariates selected by screening is further provided to estimate the extreme value index after screening. Monte Carlo simulations confirmed the finite sample performance of the proposed method. In addition, a real-data application is presented.

Subject: Methodology

Publish: 2024-10-21 07:21:29 UTC
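
For orientation on entry #6, a minimal sketch of the classical, unconditional Pickands estimator of the extreme value index, $\hat{\gamma} = \frac{1}{\ln 2} \ln \frac{X_{(n-k+1)} - X_{(n-2k+1)}}{X_{(n-2k+1)} - X_{(n-4k+1)}}$, where $X_{(1)} \le \dots \le X_{(n)}$ are the order statistics. The paper uses a conditional version that depends on covariates, which is not shown here; the function name and the choice of `k` are assumptions.

```python
import math


def pickands_estimator(sample, k):
    """Classical Pickands estimator of the extreme value index (sketch)."""
    x = sorted(sample)
    n = len(x)
    if k < 1 or 4 * k > n:
        raise ValueError("k must satisfy 1 <= k <= n / 4")
    # 1-based order statistic X_(j) is x[j - 1] in 0-based indexing.
    top = x[n - k] - x[n - 2 * k]
    bottom = x[n - 2 * k] - x[n - 4 * k]
    if top <= 0 or bottom <= 0:
        raise ValueError("tied order statistics; the estimator is undefined")
    return math.log(top / bottom) / math.log(2)
```

On an evenly spaced sample, which mimics a uniform distribution, the estimate is exactly -1, the extreme value index of a distribution with a finite uniform-type endpoint.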

#7 Assessing mediation in cross-sectional stepped wedge cluster randomized trials

Authors: Zhiqiang Cao ; Fan Li

Mediation analysis has been comprehensively studied for independent data, but relatively little work has been done for correlated data, especially for the increasingly adopted stepped wedge cluster randomized trials (SW-CRTs). Motivated by challenges in understanding the effect mechanisms in pragmatic and implementation science clinical trials, we develop new methods for mediation analysis in SW-CRTs. Specifically, based on linear and generalized linear mixed models, we demonstrate how to estimate the natural indirect effect and mediation proportion in typical SW-CRTs with four data types, including both continuous and binary mediators and outcomes. Furthermore, to address the emerging challenges in exposure-time treatment effect heterogeneity, we derive the mediation expressions in SW-CRTs when the total effect varies as a function of the exposure time. The cluster jackknife approach is considered for inference across all data types and treatment effect structures. We conduct extensive simulations to evaluate the finite-sample performance of the proposed mediation estimators and demonstrate the proposed approach in a real data example. A user-friendly R package mediateSWCRT has been developed to facilitate the practical implementation of the estimators.

Subject: Methodology

Publish: 2024-10-21 02:43:07 UTC

#8 Ablation Studies for Novel Treatment Effect Estimation Models

Authors: Hugo Gobato Souto ; Francisco Louzada

Ablation studies are essential for understanding the contribution of individual components within complex models, yet their application in nonparametric treatment effect estimation remains limited. This paper emphasizes the importance of ablation studies by examining the Bayesian Causal Forest (BCF) model, particularly the inclusion of the estimated propensity score $\hat{\pi}(x_i)$ intended to mitigate regularization-induced confounding (RIC). Through a partial ablation study utilizing five synthetic data-generating processes with varying baseline and propensity score complexities, we demonstrate that excluding $\hat{\pi}(x_i)$ does not diminish the model's performance in estimating average and conditional average treatment effects or in uncertainty quantification. Moreover, omitting $\hat{\pi}(x_i)$ reduces computational time by approximately 21%. These findings suggest that the BCF model's inherent flexibility suffices in adjusting for confounding without explicitly incorporating the propensity score. The study advocates for the routine use of ablation studies in treatment effect estimation to ensure model components are essential and to prevent unnecessary complexity.

Subjects: Methodology ; Machine Learning

Publish: 2024-10-21 01:05:24 UTC

#9 Simultaneous Inference in Multiple Matrix-Variate Graphs for High-Dimensional Neural Recordings

Authors: Zongge Liu ; Heejong Bong ; Zhao Ren ; Matthew A. Smith ; Robert E. Kass

As large-scale neural recordings become common, many neuroscientific investigations are focused on identifying functional connectivity from spatio-temporal measurements in two or more brain areas across multiple sessions. Spatial-temporal data in neural recordings can be represented as matrix-variate data, with time as the first dimension and space as the second. In this paper, we exploit the multiple matrix-variate Gaussian Graphical model to encode the common underlying spatial functional connectivity across multiple sessions of neural recordings. By effectively integrating information across multiple graphs, we develop a novel inferential framework that allows simultaneous testing to detect meaningful connectivity for a target edge subset of arbitrary size. Our test statistics are based on a group penalized regression approach and a high-dimensional Gaussian approximation technique. The validity of simultaneous testing is demonstrated theoretically under mild assumptions on sample size and non-stationary autoregressive temporal dependence. Our test is nearly optimal in achieving the testable region boundary. Additionally, our method involves only convex optimization and parametric bootstrap, making it computationally attractive. We demonstrate the efficacy of the new method through both simulations and an experimental study involving multiple local field potential (LFP) recordings in the Prefrontal Cortex (PFC) and visual area V4 during a memory-guided saccade task.

Subjects: Methodology ; Statistics Theory

Publish: 2024-10-20 22:50:02 UTC

#10 Randomization Inference for Before-and-After Studies with Multiple Units: An Application to a Criminal Procedure Reform in Uruguay

Authors: Matias D. Cattaneo ; Carlos Diaz ; Rocio Titiunik

We study the immediate impact of a new code of criminal procedure on crime. In November 2017, Uruguay switched from an inquisitorial system (where a single judge leads the investigation and decides the appropriate punishment for a particular crime) to an adversarial system (where the investigation is now led by prosecutors and the judge plays an overseeing role). To analyze the short-term effects of this reform, we develop a randomization-based approach for before-and-after studies with multiple units. Our framework avoids parametric time series assumptions and eliminates extrapolation by basing statistical inferences on finite-sample methods that rely only on the time periods closest to the time of the policy intervention. A key identification assumption underlying our method is that there would have been no time trends in the absence of the intervention, which is most plausible in a small window around the time of the reform. We also discuss several falsification methods to assess the plausibility of this assumption. Using our proposed inferential approach, we find statistically significant short-term causal effects of the crime reform. Our unbiased estimate shows an average increase of approximately 25 police reports per day in the week following the implementation of the new adversarial system in Montevideo, representing an 8 percent increase compared to the previous week under the old system.

Subjects: Methodology ; Applications

Publish: 2024-10-20 19:28:19 UTC

#11 A New Framework for Bayesian Function Registration

Authors: Yijia Ma ; Wei Wu

Function registration, also referred to as alignment, has been one of the fundamental problems in the field of functional data analysis. Classical registration methods such as the Fisher-Rao alignment focus on estimating the optimal time warping function between functions. In recent studies, a model on time warping has attracted more attention, and it can be used as a prior term to combine with the classical method (as a likelihood term) in a Bayesian framework. The Bayesian approaches have been shown to improve on the classical methods. However, their prior model on time warping is often based on a nonlinear approximation, which may introduce inaccuracy and inefficiency. To overcome these problems, we propose a new Bayesian approach by adopting a prior which provides a linear representation, so that various stochastic processes (Gaussian or non-Gaussian) can be effectively utilized on time warping. No linearization approximation is needed in the time warping computation, and the posterior can be obtained via a conventional Markov Chain Monte Carlo approach. We thoroughly investigate the impact of the prior on the performance of functional registration with multiple simulation examples, which demonstrate the superiority of the new framework over the previous methods. We finally apply the new method to a real dataset and obtain a desirable alignment result.

Subject: Methodology

Publish: 2024-10-20 15:38:02 UTC

#12 Probabilities for asymmetric p-outside values

Author: Pavlina K. Jordanova

In 2017-2020, Jordanova and co-authors investigated probabilities for p-outside values and determined them in many particular cases. They showed that these probabilities are closely related to the concept of heavy tails. Tukey's boxplots are very popular and useful in practice. Analogously to the chi-square criterion, the relative frequencies with which observations fall in the different parts of the boxplot, compared with the corresponding probabilities that an observation from a fixed probability distribution falls in the same parts, help practitioners find an accurate probability distribution for the observed random variable. This opens the door to distribution-sensitive estimators, which in many cases are more accurate, especially for small samples. All these methods, however, suffer from the disadvantage that they use the interquantile range in a symmetric way. The concept of outside values should take into account the form of the distribution. Therefore, here we allow for more asymmetry in the analysis of the tails of distributions. We suggest new theoretical and empirical boxplots and characteristics of the tails of distributions. These are theoretical asymmetric p-outside values functions. We partially investigate their properties and give some examples. It turns out that they do not depend on the center or the scaling factor of the distribution. Therefore, they are well suited for comparing the tails of distributions and, later on, for estimating the parameters that govern the tail behaviour of the cumulative distribution function.

Subject: Methodology

Publish: 2024-10-20 12:49:29 UTC

#13 High-dimensional prediction for count response via sparse exponential weights

Author: The Tien Mai

Count data is prevalent in various fields like ecology, medical research, and genomics. In high-dimensional settings, where the number of features exceeds the sample size, feature selection becomes essential. While frequentist methods like Lasso have advanced in handling high-dimensional count data, Bayesian approaches remain under-explored, with no theoretical results on prediction performance. This paper introduces a novel probabilistic machine learning framework for high-dimensional count data prediction. We propose a pseudo-Bayesian method that integrates a scaled Student prior to promote sparsity and uses an exponential weight aggregation procedure. A key contribution is a novel risk measure tailored to count data prediction, with theoretical guarantees for prediction risk using PAC-Bayesian bounds. Our results include non-asymptotic oracle inequalities, demonstrating rate-optimal prediction error without prior knowledge of sparsity. We implement this approach efficiently using the Langevin Monte Carlo method. Simulations and a real data application highlight the strong performance of our method compared to the Lasso in various settings.

Subjects: Methodology ; Statistics Theory ; Machine Learning

Publish: 2024-10-20 12:45:42 UTC

#14 Bayesian-based Propensity Score Subclassification Estimator

Authors: Shunichiro Orihara ; Tomotaka Momozaki

Subclassification estimators are one of the methods used to estimate causal effects of interest using the propensity score. This method is more stable compared to other weighting methods, such as inverse probability weighting estimators, in terms of the variance of the estimators. In subclassification estimators, the number of strata is traditionally set at five, and this number is not typically chosen based on data information. Even when the number of strata is selected, the uncertainty from the selection process is often not properly accounted for. In this study, we propose a novel Bayesian-based subclassification estimator that can assess the uncertainty in the number of strata, rather than selecting a single optimal number, using a Bayesian paradigm. To achieve this, we apply a general Bayesian procedure that does not rely on a likelihood function. This procedure allows us to avoid making strong assumptions about the outcome model, maintaining the same flexibility as traditional causal inference methods. With the proposed Bayesian procedure, it is expected that uncertainties from the design phase can be appropriately reflected in the analysis phase, which is sometimes overlooked in non-Bayesian contexts.

Subject: Methodology

Publish: 2024-10-19 13:31:51 UTC
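
As background for entry #14, a minimal sketch of a conventional propensity score subclassification estimator with a fixed number of strata (five by default, the traditional choice the abstract mentions). Units are sorted by propensity score, cut into equal-sized strata, and the within-stratum differences in mean outcome are averaged with weights proportional to stratum size. The function name, the equal-size cut rule, and the handling of strata that lack one treatment group are assumptions; the paper's Bayesian procedure for the number of strata is not shown.

```python
def subclassification_ate(propensity, treated, outcome, n_strata=5):
    """Subclassification estimate of the average treatment effect (sketch)."""
    n = len(propensity)
    order = sorted(range(n), key=lambda i: propensity[i])
    weighted_sum, used = 0.0, 0
    for s in range(n_strata):
        # Equal-sized strata along the sorted propensity scores.
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [outcome[i] for i in idx if treated[i] == 1]
        y0 = [outcome[i] for i in idx if treated[i] == 0]
        if not y1 or not y0:
            # Assumption of this sketch: skip strata without both groups.
            continue
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        weighted_sum += len(idx) * diff
        used += len(idx)
    if used == 0:
        raise ValueError("no stratum contains both treated and control units")
    return weighted_sum / used
```

Treating `n_strata` as a fixed argument is exactly the design choice the abstract questions: the estimate can change with the number of strata, and that selection uncertainty is not reflected in a single call.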

#15 Stochastic Loss Reserving: Dependence and Estimation

Authors: Andrew Fleck ; Edward Furman ; Yang Shen

Nowadays insurers have to account for potentially complex dependence between risks. In the field of loss reserving, there are many parametric and non-parametric models attempting to capture dependence between business lines. One common approach has been to use additive background risk models (ABRMs), which provide rich and interpretable dependence structures via a common shock model. Unfortunately, ABRMs are often restrictive. Models that capture the necessary features may have parameters that are impractical to estimate, for example models without a closed-form likelihood function for lack of a probability density function (e.g., some Tweedie and stable distributions). We apply to loss reserving a modification of the continuous generalised method of moments (CGMM) of [Carrasco and Florens, 2000], which delivers estimators comparable to the MLE. We examine models such as the one proposed by [Avanzi et al., 2016] and a related but novel one derived from the stable family of distributions. Our CGMM method of estimation provides conventional non-Bayesian estimates in the case where MLEs are impractical.

Subjects: Methodology ; Risk Management ; Applications

Publish: 2024-10-19 05:24:11 UTC

#16 Fast and Optimal Changepoint Detection and Localization using Bonferroni Triplets

Authors: Jayoon Jang ; Guenther Walther

The paper considers the problem of detecting and localizing changepoints in a sequence of independent observations. We propose to evaluate a local test statistic on a triplet of time points, for each such triplet in a particular collection. This collection is sparse enough so that the results of the local tests can simply be combined with a weighted Bonferroni correction. This results in a simple and fast method, Lean Bonferroni Changepoint detection (LBD), that provides finite sample guarantees for the existence of changepoints as well as simultaneous confidence intervals for their locations. LBD is free of tuning parameters, and we show that LBD allows optimal inference for the detection of changepoints. To this end, we provide a lower bound for the critical constant that measures the difficulty of the changepoint detection problem, and we show that LBD attains this critical constant. We illustrate LBD for a number of distributional settings, namely when the observations are homoscedastic normal with known or unknown variance, for observations from a natural exponential family, and in a nonparametric setting where we assume only exchangeability for segments without a changepoint.

Subject: Methodology

Publish: 2024-10-18 21:20:51 UTC

#17 A New One Parameter Unit Distribution: Median Based Unit Rayleigh (MBUR): Parametric Quantile Regression Model

Author: Iman Mohamed Attia

Parametric quantile regression is illustrated for the new one-parameter unit Rayleigh distribution called the Median Based Unit Rayleigh (MBUR) distribution. The estimation process using the re-parameterized maximum likelihood function is highlighted with a real dataset example. Inference and goodness of fit are also explored.

Subjects: Methodology ; Probability

Publish: 2024-10-18 20:50:18 UTC

#18 Differentially Private Covariate Balancing Causal Inference

Authors: Yuki Ohnishi ; Jordan Awan

Differential privacy is the leading mathematical framework for privacy protection, providing a probabilistic guarantee that safeguards individuals' private information when publishing statistics from a dataset. This guarantee is achieved by applying a randomized algorithm to the original data, which introduces unique challenges in data analysis by distorting inherent patterns. In particular, causal inference using observational data in privacy-sensitive contexts is challenging because it requires covariate balance between treatment groups, yet checking the true covariates is prohibited to prevent leakage of sensitive information. In this article, we present a differentially private two-stage covariate balancing weighting estimator to infer causal effects from observational data. Our algorithm produces both point and interval estimators with statistical guarantees, such as consistency and rate optimality, under a given privacy budget.

Subjects: Methodology ; Cryptography and Security ; Machine Learning

Publish: 2024-10-18 18:02:13 UTC

#19 Towards more realistic climate model outputs: A multivariate bias correction based on zero-inflated vine copulas

Authors: Henri Funk ; Ralf Ludwig ; Helmut Kuechenhoff ; Thomas Nagler

Climate model large ensembles are an essential research tool for analysing and quantifying natural climate variability and providing robust information for rare extreme events. The models' simulated representations of reality are susceptible to bias due to incomplete understanding of physical processes. This paper aims to correct the bias of five climate variables from the CRCM5 Large Ensemble over Central Europe at a 3-hourly temporal resolution. At this high temporal resolution, two variables, precipitation and radiation, exhibit a high share of zero inflation. We propose a novel bias-correction method, VBC (Vine copula bias correction), that models and transfers multivariate dependence structures for zero-inflated margins in the data from its error-prone model domain to a reference domain. VBC estimates the model and reference distribution using vine copulas and corrects the model distribution via (inverse) Rosenblatt transformation. To deal with the variables' zero-inflated nature, we develop a new vine density decomposition that accommodates such variables and employs an adequately randomized version of the Rosenblatt transform. This novel approach allows for more accurate modelling of multivariate zero-inflated climate data. Compared with state-of-the-art correction methods, VBC is generally the best-performing correction and the most accurate method for correcting zero-inflated events.

Subjects: Applications ; Methodology

Publish: 2024-10-21 11:59:19 UTC

#20 A Kernelization-Based Approach to Nonparametric Binary Choice Models

Author: Guo Yan

We propose a new estimator for nonparametric binary choice models that does not impose a parametric structure on either the systematic function of covariates or the distribution of the error term. A key advantage of our approach is its computational efficiency. For instance, even when assuming a normal error distribution as in probit models, commonly used sieves for approximating an unknown function of covariates can lead to a large-dimensional optimization problem when the number of covariates is moderate. Our approach, motivated by kernel methods in machine learning, views certain reproducing kernel Hilbert spaces as special sieve spaces, coupled with spectral cut-off regularization for dimension reduction. We establish the consistency of the proposed estimator for both the systematic function of covariates and the distribution function of the error term, and asymptotic normality of the plug-in estimator for weighted average partial derivatives. Simulation studies show that, compared to parametric estimation methods, the proposed method effectively improves finite sample performance in cases of misspecification, and has a rather mild efficiency loss if the model is correctly specified. Using administrative data on the grant decisions of US asylum applications to immigration courts, along with nine case-day variables on weather and pollution, we re-examine the effect of outdoor temperature on court judges' "mood", and thus, their grant decisions.

Subjects: Econometrics ; Methodology

Publish: 2024-10-21 07:53:14 UTC

#21 Quantiles and Quantile Regression on Riemannian Manifolds: a measure-transportation-based approach

Authors: Marc Hallin ; Hang Liu

Increased attention has been given recently to the statistical analysis of variables with values on nonlinear manifolds. A natural but nontrivial problem in that context is the definition of quantile concepts. We are proposing a solution for compact Riemannian manifolds without boundaries; typical examples are polyspheres, hyperspheres, and toroidal manifolds equipped with their Riemannian metrics. Our concept of quantile function comes along with a concept of distribution function and, in the empirical case, ranks and signs. The absence of a canonical ordering is offset by resorting to the data-driven ordering induced by optimal transports. Theoretical properties, such as the uniform convergence of the empirical distribution and conditional (and unconditional) quantile functions and distribution-freeness of ranks and signs, are established. Statistical inference applications, from goodness-of-fit to distribution-free rank-based testing, are without number. Of particular importance is the case of quantile regression with directional or toroidal multiple output, which is given special attention in this paper. Extensive simulations are carried out to illustrate these novel concepts.

Subjects: Statistics Theory ; Geometric Topology ; Methodology

Publish: 2024-10-21 07:31:56 UTC

#22 Accounting for Missing Covariates in Heterogeneous Treatment Estimation

Authors: Khurram Yamin ; Vibhhu Sharma ; Ed Kennedy ; Bryan Wilder

Many applications of causal inference require using treatment effects estimated on a study population to make decisions in a separate target population. We consider the challenging setting where there are covariates that are observed in the target population that were not seen in the original study. Our goal is to estimate the tightest possible bounds on heterogeneous treatment effects conditioned on such newly observed covariates. We introduce a novel partial identification strategy based on ideas from ecological inference; the main idea is that estimates of conditional treatment effects for the full covariate set must marginalize correctly when restricted to only the covariates observed in both populations. Furthermore, we introduce a bias-corrected estimator for these bounds and prove that it enjoys fast convergence rates and statistical guarantees (e.g., asymptotic normality). Experimental results on both real and synthetic data demonstrate that our framework can produce bounds that are much tighter than would otherwise be possible.

Subjects: Machine Learning ; Methodology

Publish: 2024-10-21 05:47:07 UTC

#23 Linking Model Intervention to Causal Interpretation in Model Explanation

Authors: Debo Cheng ; Ziqi Xu ; Jiuyong Li ; Lin Liu ; Kui Yu ; Thuc Duy Le ; Jixue Liu

Intervention intuition is often used in model explanation, where the intervention effect of a feature on the outcome is quantified by the difference in a model's prediction when the feature value is changed from its current value to a baseline value. Such a model intervention effect of a feature is inherently associational. In this paper, we study the conditions under which an intuitive model intervention effect has a causal interpretation, i.e., when it indicates whether a feature is a direct cause of the outcome. This work links the model intervention effect to the causal interpretation of a model. Such an interpretation capability is important since it indicates whether a machine learning model is trustworthy to domain experts. The conditions also reveal the limitations of using a model intervention effect for causal interpretation in an environment with unobserved features. Experiments on semi-synthetic datasets have been conducted to validate the theorems and show the potential for using the model intervention effect for model interpretation.
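
A minimal sketch of the model intervention effect described above, assuming the effect is defined as the prediction at the current feature value minus the prediction with that feature set to its baseline (the sign convention, function names, and toy model are illustrative assumptions, not taken from the paper):

```python
from typing import Callable, List, Sequence


def intervention_effect(model: Callable[[Sequence[float]], float],
                        x: Sequence[float],
                        feature: int,
                        baseline: float) -> float:
    """Change in the model's prediction when one feature is moved from
    its current value to a baseline value, all other features held fixed."""
    x_baseline: List[float] = list(x)   # copy so the input is not mutated
    x_baseline[feature] = baseline
    return model(x) - model(x_baseline)


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical linear model used only for illustration."""
    return 2.0 * x[0] + 0.5 * x[1]


if __name__ == "__main__":
    x = [3.0, 4.0]
    print(intervention_effect(toy_model, x, feature=0, baseline=1.0))
    print(intervention_effect(toy_model, x, feature=1, baseline=0.0))
```

The quantity is computed purely from the fitted model, which is why it is associational by construction; whether it also reflects a direct causal effect depends on conditions of the kind the abstract refers to.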

Subjects: Machine Learning ; Methodology

Publish: 2024-10-21 05:16:59 UTC

#24 Reward Maximization for Pure Exploration: Minimax Optimal Good Arm Identification for Nonparametric Multi-Armed Bandits

Authors: Brian Cho ; Dominik Meier ; Kyra Gan ; Nathan Kallus

In multi-armed bandits, the tasks of reward maximization and pure exploration are often at odds with each other. The former focuses on exploiting arms with the highest means, while the latter may require constant exploration across all arms. In this work, we focus on good arm identification (GAI), a practical bandit inference objective that aims to label arms with means above a threshold as quickly as possible. We show that GAI can be efficiently solved by combining a reward-maximizing sampling algorithm with a novel nonparametric anytime-valid sequential test for labeling arm means. We first establish that our sequential test maintains error control under highly nonparametric assumptions and asymptotically achieves the minimax optimal e-power, a notion of power for anytime-valid tests. Next, by pairing regret-minimizing sampling schemes with our sequential test, we provide an approach that achieves minimax optimal stopping times for labeling arms with means above a threshold, under an error probability constraint. Our empirical results validate our approach beyond the minimax setting, reducing the expected number of samples for all stopping times by at least 50% across both synthetic and real-world settings.
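
To give a flavor of what an anytime-valid sequential test for labeling a single arm can look like, here is a minimal sketch. It is not the test proposed in the paper: it assumes rewards in $[0, 1]$ and uses a standard fixed-bet wealth process, which is a nonnegative supermartingale when the arm mean is at most the threshold, so by Ville's inequality the probability of ever reaching $1/\alpha$ is at most $\alpha$. The function name and the fixed bet size are illustrative choices.

```python
from typing import Iterable, Optional


def label_arm_above_threshold(rewards: Iterable[float],
                              threshold: float,
                              alpha: float = 0.05,
                              bet: float = 1.0) -> Optional[int]:
    """Return the first time the arm is labeled 'mean > threshold',
    or None if the evidence never crosses 1/alpha.

    Assumptions (illustrative): rewards lie in [0, 1] and the bet is
    fixed in [0, 1/threshold], which keeps the wealth nonnegative.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be in (0, 1)")
    if not 0.0 <= bet <= 1.0 / threshold:
        raise ValueError("bet must be in [0, 1/threshold]")
    wealth = 1.0
    for t, x in enumerate(rewards, start=1):
        # Wealth grows when rewards exceed the threshold, shrinks otherwise.
        wealth *= 1.0 + bet * (x - threshold)
        if wealth >= 1.0 / alpha:
            return t
    return None


if __name__ == "__main__":
    print(label_arm_above_threshold([1.0] * 20, threshold=0.5))
    print(label_arm_above_threshold([0.5] * 20, threshold=0.5))
```

Because the guarantee holds at every stopping time, such a test can be paired with any adaptive sampling rule, which is the property that allows combining it with a reward-maximizing sampler.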

Subjects: Machine Learning ; Methodology ; Machine Learning

Publish: 2024-10-21 01:19:23 UTC

#25 Structural Causality-based Generalizable Concept Discovery Models

Authors: Sanchit Sinha ; Guangzhi Xiong ; Aidong Zhang

The rising need for explainable deep neural network (DNN) architectures has led to the use of semantic concepts as explainable units. Several approaches based on disentangled representation learning estimate the generative factors and use them as concepts for explaining DNNs. However, even though the generative factors for a dataset remain fixed, concepts are not fixed entities and vary with the downstream task. In this paper, we propose a disentanglement mechanism that uses a variational autoencoder (VAE) to learn mutually independent generative factors for a given dataset and subsequently learns task-specific concepts using a structural causal model (SCM). Our method assumes that generative factors and concepts form a bipartite graph, with directed causal edges from generative factors to concepts. Experiments are conducted on datasets with known generative factors: dSprites and Shapes3D. On specific downstream tasks, our proposed method successfully learns task-specific concepts that are explained well by the causal edges from the generative factors. Lastly, unlike current causal concept discovery methods, our methodology generalizes to an arbitrary number of concepts and is flexible with respect to the downstream task.

Subjects: Machine Learning ; Methodology

Publish: 2024-10-20 20:09:47 UTC