Cool Papers - Immersive Paper Discovery
https://papers.cool/arxiv/stat.ME
Methodology
2024-08-15T00:00:00+00:00

https://papers.cool/arxiv/2404.02400
On Improved Semi-parametric Bounds for Tail Probability and Expected Loss
2024-08-15T00:00:00+00:00
Zhaolin Li, Artem Prokhorov
We revisit the fundamental issue of tail behavior of accumulated random realizations when individual realizations are independent, and we develop new sharper bounds on the tail probability and expected linear loss. The underlying distribution is semi-parametric in the sense that it remains unrestricted other than the assumed mean and variance. Our sharp bounds complement well-established results in the literature, including those based on aggregation, which often fail to take full account of independence and use less elegant proofs. New insights include a proof that in the non-identical case, the distributions attaining the bounds have the equal range property, and that the impact of each random variable on the expected value of the sum can be isolated using an extension of the Korkine identity. We show that the new bounds not only complement the extant results but also open up abundant practical applications, including improved pricing of product bundles, more precise option pricing, more efficient insurance design, and better inventory management.

https://papers.cool/arxiv/2408.07209
Local linear smoothing for regression surfaces on the simplex using Dirichlet kernels
2024-08-15T00:00:00+00:00
Christian Genest, Frédéric Ouimet
This paper introduces a local linear smoother for regression surfaces on the simplex. The estimator solves a least-squares regression problem weighted by a locally adaptive Dirichlet kernel, ensuring excellent boundary properties. Asymptotic results for the bias, variance, mean squared error, and mean integrated squared error are derived, generalizing the univariate results of Chen (2002). A simulation study shows that the proposed local linear estimator with Dirichlet kernel outperforms its only direct competitor in the literature, the Nadaraya-Watson estimator with Dirichlet kernel due to Bouzebda, Nezzal, and Elhattab (2024).

https://papers.cool/arxiv/2408.07219
Causal Effect Estimation using identifiable Variational AutoEncoder with Latent Confounders and Post-Treatment Variables
2024-08-15T00:00:00+00:00
Yang Xie, Ziqi Xu, Debo Cheng, Jiuyong Li, Lin Liu, Yinghao Zhang, Zaiwen Feng
Estimating causal effects from observational data is challenging, especially in the presence of latent confounders. Much work has been done on addressing this challenge, but most of the existing research ignores the bias introduced by the post-treatment variables. In this paper, we propose a novel method of joint Variational AutoEncoder (VAE) and identifiable Variational AutoEncoder (iVAE) for learning the representations of latent confounders and latent post-treatment variables from their proxy variables, termed CPTiVAE, to achieve unbiased causal effect estimation from observational data. We further prove the identifiability in terms of the representation of latent post-treatment variables. Extensive experiments on synthetic and semi-synthetic datasets demonstrate that the CPTiVAE outperforms the state-of-the-art methods in the presence of latent confounders and post-treatment variables. We further apply CPTiVAE to a real-world dataset to show its potential application.

https://papers.cool/arxiv/2408.07298
Improving the use of social contact studies in epidemic modelling
2024-08-15T00:00:00+00:00
Tom Britton, Frank Ball
Social contact studies, investigating social contact patterns in a population sample, have been an important contribution for epidemic models to better fit real life epidemics.
A contact matrix $M$, having the \emph{mean} number of contacts between individuals of different age groups as its elements, is estimated and used in combination with a multitype epidemic model to produce better data fitting and to give more appropriate expressions for $R_0$ and other model outcomes. However, $M$ does not capture \emph{variation} in contacts \emph{within} each age group, which is often large in empirical settings. Here such variation within age groups is included in a simple way by dividing each age group into two halves: the socially active and the socially less active. The extended contact matrix, and its associated epidemic model, empirically show that acknowledging variation in social activity within age groups has a substantial impact on modelling outcomes such as $R_0$ and the final fraction $\tau$ getting infected. In fact, the variation in social activity within age groups is often more important for data fitting than the division into different age groups itself. However, a difficulty with heterogeneity in social activity is that social contact studies typically lack information on whether mixing with respect to social activity is assortative or not, i.e.\ do the socially active tend to mix more with other socially active individuals or more with the socially less active? The analyses show that accounting for heterogeneity in social activity improves the analyses irrespective of whether such mixing is assortative or not, but the different assumptions give rather different output. Future social contact studies should hence also try to infer the degree of assortativity of contacts with respect to social activity.

https://papers.cool/arxiv/2408.07575
A General Framework for Constraint-based Causal Learning
2024-08-15T00:00:00+00:00
Kai Z. Teh, Kayvan Sadeghi, Terry Soo
By representing any constraint-based causal learning algorithm via a placeholder property, we decompose the correctness condition into a part relating the distribution and the true causal graph, and a part that depends solely on the distribution. This provides a general framework to obtain correctness conditions for causal learning, and has the following implications. We provide exact correctness conditions for the PC algorithm, which are then related to correctness conditions of some other existing causal discovery algorithms. We show that the sparsest Markov representation condition is the weakest correctness condition resulting from existing notions of minimality for maximal ancestral graphs and directed acyclic graphs. We also reason that knowledge beyond Pearl-minimality is necessary for causal learning beyond faithfulness.

https://papers.cool/arxiv/2408.07193
A comparison of methods for estimating the average treatment effect on the treated for externally controlled trials
2024-08-15T00:00:00+00:00
Huan Wang, Fei Wu, Yeh-Fong Chen
While randomized trials may be the gold standard for evaluating the effectiveness of the treatment intervention, in some special circumstances, single-arm clinical trials utilizing external control may be considered. The causal treatment effect of interest for single-arm studies is usually the average treatment effect on the treated (ATT) rather than the average treatment effect (ATE). Although methods have been developed to estimate the ATT, the selection and use of these methods require a thorough comparison and in-depth understanding of the advantages and disadvantages of these methods.
In this study, we conducted simulations under different identifiability assumptions to compare the performance metrics (e.g., bias, standard deviation (SD), mean squared error (MSE), type I error rate) for a variety of methods, including the regression model, propensity score matching, Mahalanobis distance matching, coarsened exact matching, inverse probability weighting, augmented inverse probability weighting (AIPW), AIPW with SuperLearner, and targeted maximum likelihood estimator (TMLE) with SuperLearner. Our simulation results demonstrate that the doubly robust methods in general have smaller biases than other methods. In terms of SD, nonmatching methods in general have smaller SDs than matching-based methods. The performance of MSE is a trade-off between the bias and SD, and no method consistently performs better in terms of MSE. The identifiability assumptions are critical to the models' performance: violation of the positivity assumption can lead to a significant inflation of type I errors in some methods; violation of the unconfoundedness assumption can lead to a large bias for all methods... (Further details are available in the main body of the paper).

https://papers.cool/arxiv/2408.07231
Estimating the FDR of variable selection
2024-08-15T00:00:00+00:00
Yixiang Luo, William Fithian, Lihua Lei
We introduce a generic estimator for the false discovery rate of any model selection procedure, in common statistical modeling settings including the Gaussian linear model, Gaussian graphical model, and model-X setting. We prove that our method has a conservative (non-negative) bias in finite samples under standard statistical assumptions, and provide a bootstrap method for assessing its standard error.
For methods like the Lasso, forward-stepwise regression, and the graphical Lasso, our estimator serves as a valuable companion to cross-validation, illuminating the tradeoff between prediction error and variable selection accuracy as a function of the model complexity parameter.

https://papers.cool/arxiv/2408.07240
Sensitivity of MCMC-based analyses to small-data removal
2024-08-15T00:00:00+00:00
Tin D. Nguyen, Ryan Giordano, Rachael Meager, Tamara Broderick
If the conclusion of a data analysis is sensitive to dropping very few data points, that conclusion might hinge on the particular data at hand rather than representing a more broadly applicable truth. How could we check whether this sensitivity holds? One idea is to consider every small subset of data, drop it from the dataset, and re-run our analysis. But running MCMC to approximate a Bayesian posterior is already very expensive; running multiple times is prohibitive, and the number of re-runs needed here is combinatorially large. Recent work proposes a fast and accurate approximation to find the worst-case dropped data subset, but that work was developed for problems based on estimating equations -- and does not directly handle Bayesian posterior approximations using MCMC. We make two principal contributions in the present work. We adapt the existing data-dropping approximation to estimators computed via MCMC. Observing that Monte Carlo errors induce variability in the approximation, we use a variant of the bootstrap to quantify this uncertainty. We demonstrate how to use our approximation in practice to determine whether there is non-robustness in a problem. Empirically, our method is accurate in simple models, such as linear regression. In models with complicated structure, such as hierarchical models, the performance of our method is mixed.

https://papers.cool/arxiv/2408.07365
Fast Bayesian inference in a class of sparse linear mixed effects models
2024-08-15T00:00:00+00:00
M-Z. Spyropoulou, J. Hopker, J. E. Griffin
Linear mixed effects models are widely used in statistical modelling. We consider a mixed effects model with Bayesian variable selection in the random effects using spike-and-slab priors and develop a variational Bayes inference scheme that can be applied to large data sets. An EM algorithm is proposed for the model with normal errors where the posterior distribution of the variable inclusion parameters is approximated using an Occam's window approach. Placing this approach within a variational Bayes scheme also allows the algorithm to be extended to the model with skew-t errors. The performance of the algorithm is evaluated in a simulation study and applied to a longitudinal model for elite athlete performance in the 100 metre sprint and weightlifting.

https://papers.cool/arxiv/2408.07463
A novel framework for quantifying nominal outlyingness
2024-08-15T00:00:00+00:00
Efthymios Costa, Ioanna Papatsouma
Outlier detection is an important data mining tool that becomes particularly challenging when dealing with nominal data. First and foremost, flagging observations as outlying requires a well-defined notion of nominal outlyingness. This paper presents a definition of nominal outlyingness and introduces a general framework for quantifying outlyingness of nominal data. The proposed framework makes use of ideas from the association rule mining literature and can be used for calculating scores that indicate how outlying a nominal observation is. Methods for determining the involved hyperparameter values are presented and the concepts of variable contributions and outlyingness depth are introduced, in an attempt to enhance interpretability of the results. An implementation of the framework is tested on five real-world data sets and the key findings are outlined.
The ideas presented can serve as a tool for assessing the degree to which an observation differs from the rest of the data, under the assumption of sequences of nominal levels having been generated from a Multinomial distribution with varying event probabilities.