https://papers.cool/arxiv/econ.EM
Econometrics
2024-06-21T00:00:00+00:00
python-feedgen
Cool Papers - Immersive Paper Discovery

https://papers.cool/arxiv/2406.11308
Management Decisions in Manufacturing using Causal Machine Learning -- To Rework, or not to Rework?
2024-06-21T00:00:00+00:00
Philipp Schwarz, Oliver Schacht, Sven Klaassen, Daniel Grünbaum, Sebastian Imhof, Martin Spindler
In this paper, we present a data-driven model for estimating optimal rework policies in manufacturing systems. We consider a single production stage within a multistage, lot-based system that allows for optional rework steps. While the rework decision depends on an intermediate state of the lot and system, the final product inspection, and thus the assessment of the actual yield, is delayed until production is complete. Repair steps are applied uniformly to the lot, potentially improving some of the individual items while degrading others. The challenge is thus to balance potential yield improvement with the rework costs incurred. Given the inherently causal nature of this decision problem, we propose a causal model to estimate yield improvement. We apply methods from causal machine learning, in particular double/debiased machine learning (DML) techniques, to estimate conditional treatment effects from data and derive policies for rework decisions. We validate our decision model using real-world data from opto-electronic semiconductor manufacturing, achieving a yield improvement of 2-3% during the color-conversion process of white light-emitting diodes (LEDs).

https://papers.cool/arxiv/2406.14145
Temperature in the Iberian Peninsula: Trend, seasonality, and heterogeneity
2024-06-21T00:00:00+00:00
C. Vladimir Rodríguez-Caballero, Esther Ruiz
In this paper, we propose fitting unobserved component models to represent the dynamic evolution of bivariate systems of centre and log-range temperatures obtained monthly from minimum/maximum temperatures observed at a given location.
In doing so, the centre and log-range temperature are decomposed into potentially stochastic trends, seasonal, and transitory components. Since our model encompasses deterministic trends and seasonal components as limiting cases, we contribute to the debate on whether stochastic or deterministic components better represent the trend and seasonal components. The methodology is applied to centre and log-range temperature observed in four locations in the Iberian Peninsula, namely, Barcelona, Coruña, Madrid, and Seville. We show that, at each location, the centre temperature can be represented by a smooth integrated random walk with time-varying slope, while a stochastic level better represents the log-range. We also show that centre and log-range temperature are unrelated. The methodology is then extended to simultaneously model centre and log-range temperature observed at several locations in the Iberian Peninsula. We fit a multi-level dynamic factor model to extract potential commonalities among centre (log-range) temperature while also allowing for heterogeneity in different areas in the Iberian Peninsula. We show that, although the commonality in trends of average temperature is considerable, the regional components are also relevant.

https://papers.cool/arxiv/2406.13122
Testing for Underpowered Literatures
2024-06-21T00:00:00+00:00
Stefan Faridani
How many experimental studies would have come to different conclusions had they been run on larger samples? I show how to estimate the expected number of statistically significant results that a set of experiments would have reported had their sample sizes all been counterfactually increased by a chosen factor. The estimator is consistent and asymptotically normal. Unlike existing methods, my approach requires no assumptions about the distribution of true effects of the interventions being studied other than continuity. This method includes an adjustment for publication bias in the reported t-scores.
An application to randomized controlled trials (RCTs) published in top economics journals finds that doubling every experiment's sample size would only increase the power of two-sided t-tests by 7.2 percentage points on average. This effect is small and is comparable to the effect for systematic replication projects in laboratory psychology where previous studies enabled accurate power calculations ex ante. These effects are both smaller than for non-RCTs. This comparison suggests that RCTs are on average relatively insensitive to sample size increases. The policy implication is that grant givers should generally fund more experiments rather than fewer, larger ones.

https://papers.cool/arxiv/2406.13395
Bayesian Inference for Multidimensional Welfare Comparisons
2024-06-21T00:00:00+00:00
David Gunawan, William Griffiths, Duangkamon Chotikapanich
Using both single-index measures and stochastic dominance concepts, we show how Bayesian inference can be used to make multivariate welfare comparisons. A four-dimensional distribution for the well-being attributes income, mental health, education, and happiness is estimated via Bayesian Markov chain Monte Carlo using unit-record data taken from the Household, Income and Labour Dynamics in Australia survey. Marginal distributions of beta and gamma mixtures and discrete ordinal distributions are combined using a copula. Improvements in both well-being generally and poverty magnitude are assessed using posterior means of single-index measures and posterior probabilities of stochastic dominance. The conditions for stochastic dominance depend on the class of utility functions that is assumed to define a social welfare function and the number of attributes in the utility function.
Three classes of utility functions are considered, and posterior probabilities of dominance are computed for one, two, and four-attribute utility functions for three time intervals within the period 2001 to 2019.

https://papers.cool/arxiv/2406.13826
Testing identification in mediation and dynamic treatment models
2024-06-21T00:00:00+00:00
Martin Huber, Kevin Kloiber, Lukas Laffers
We propose a test for the identification of causal effects in mediation and dynamic treatment models that is based on two sets of observed variables, namely covariates to be controlled for and suspected instruments, building on the test by Huber and Kueck (2022) for single treatment models. We consider models with a sequential assignment of a treatment and a mediator to assess the direct treatment effect (net of the mediator), the indirect treatment effect (via the mediator), or the joint effect of both treatment and mediator. We establish testable conditions for identifying such effects in observational data. These conditions jointly imply (1) the exogeneity of the treatment and the mediator conditional on covariates and (2) the validity of distinct instruments for the treatment and the mediator, meaning that the instruments do not directly affect the outcome (other than through the treatment or mediator) and are unconfounded given the covariates. Our framework extends to post-treatment sample selection or attrition problems when replacing the mediator by a selection indicator for observing the outcome, enabling joint testing of the selectivity of treatment and attrition. We propose a machine learning-based test to control for covariates in a data-driven manner and analyze its finite sample performance in a simulation study.
Additionally, we apply our method to Slovak labor market data and find that our testable implications are not rejected for a sequence of training programs typically considered in dynamic treatment evaluations.

https://papers.cool/arxiv/2406.14046
Estimating Time-Varying Parameters of Various Smoothness in Linear Models via Kernel Regression
2024-06-21T00:00:00+00:00
Mikihito Nishi
We consider estimating nonparametric time-varying parameters in linear models using kernel regression. Our contributions are twofold. First, we consider a broad class of time-varying parameters including deterministic smooth functions, the rescaled random walk, structural breaks, the threshold model and their mixtures. We show that those time-varying parameters can be consistently estimated by kernel regression. Our analysis exploits the smoothness of time-varying parameters rather than their specific form. The second contribution is to reveal that the bandwidth used in kernel regression determines the trade-off between the rate of convergence and the size of the class of time-varying parameters that can be estimated. An implication of our result is that the bandwidth should be proportional to $T^{-1/2}$ if the time-varying parameter follows the rescaled random walk, where $T$ is the sample size. We propose a specific choice of the bandwidth that accommodates a wide range of time-varying parameter models. An empirical application shows that the kernel-based estimator with this choice can capture the random-walk dynamics in time-varying parameters.

https://papers.cool/arxiv/2406.14380
Estimating Treatment Effects under Recommender Interference: A Structured Neural Networks Approach
2024-06-21T00:00:00+00:00
Ruohan Zhan, Shichao Han, Yuchen Hu, Zhenling Jiang
Recommender systems are essential to content-sharing platforms because they curate personalized content.
To evaluate updates of recommender systems targeting content creators, platforms frequently engage in creator-side randomized experiments to estimate the treatment effect, defined as the difference in outcomes when a new (vs. the status quo) algorithm is deployed on the platform. We show that the standard difference-in-means estimator can lead to a biased treatment effect estimate. This bias arises because of recommender interference, which occurs when treated and control creators compete for exposure through the recommender system. We propose a "recommender choice model" that captures how an item is chosen among a pool composed of both treated and control content items. By combining a structural choice model with neural networks, the framework directly models the interference pathway in a microfounded way while accounting for rich viewer-content heterogeneity. Using the model, we construct a double/debiased estimator of the treatment effect that is consistent and asymptotically normal. We demonstrate its empirical performance with a field experiment on the Weixin short-video platform: besides the standard creator-side experiment, we carry out a costly blocked double-sided randomization design to obtain a benchmark estimate without interference bias. We show that the proposed estimator significantly reduces the bias in treatment effect estimates compared to the standard difference-in-means estimator.
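
The interference bias described in the abstract above can be illustrated with a toy simulation. This is only a sketch under strong assumptions and is not the paper's recommender choice model or its debiased estimator: exposure is assumed to follow a plain softmax over creator scores, the treatment is assumed to add a constant `boost` to a creator's score, and all names (`exposure_shares`, `simulate`, `boost`) are hypothetical.

```python
import math
import random


def exposure_shares(scores):
    """Softmax choice probabilities: the share of exposure each creator wins."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def mean(values):
    return sum(values) / len(values)


def simulate(n_creators=1000, boost=0.5, seed=0):
    """Return (global effect, difference-in-means estimate) in the toy model."""
    rng = random.Random(seed)
    base = [rng.gauss(0.0, 1.0) for _ in range(n_creators)]

    # Global effect: everyone on the new algorithm vs. everyone on the old one.
    # Shares always sum to one, so the average share cannot change.
    all_control = exposure_shares(base)
    all_treated = exposure_shares([b + boost for b in base])
    global_effect = mean(all_treated) - mean(all_control)

    # Creator-side experiment: roughly half of the creators are treated and
    # compete with control creators inside the same softmax pool.
    treated = [rng.random() < 0.5 for _ in range(n_creators)]
    mixed = exposure_shares([b + boost * t for b, t in zip(base, treated)])
    treated_mean = mean([y for y, t in zip(mixed, treated) if t])
    control_mean = mean([y for y, t in zip(mixed, treated) if not t])
    return global_effect, treated_mean - control_mean


if __name__ == "__main__":
    effect, diff_in_means = simulate()
    print("global effect:", effect)
    print("difference in means:", diff_in_means)
```

In this toy setting the global effect on average exposure is zero by construction, yet the difference-in-means estimate is positive because treated creators take exposure away from control creators within the shared pool.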