2024-11-06 | Total: 8

For the distributions of finitely many binary random variables, we study the interaction of restrictions of the supports with conditional independence constraints. We prove a generalization of the Hammersley-Clifford theorem for distributions whose support is a natural distributive lattice: that is, any distribution which has natural lattice support and satisfies the pairwise Markov statements of a graph must factor according to the graph. We also show a connection to the Hibi ideals of lattices.

Resampling methods are especially well-suited to inference with estimators that provide only "black-box" access. The jackknife is a form of resampling, widely used for bias correction and variance estimation, that is well-understood under classical scaling where the sample size $n$ grows for a fixed problem. We study its behavior in application to estimating functionals using high-dimensional $Z$-estimators, allowing both the sample size $n$ and problem dimension $d$ to diverge. We begin by showing that the plug-in estimator based on the $Z$-estimate suffers from a quadratic breakdown: while it is $\sqrt{n}$-consistent and asymptotically normal whenever $n \gtrsim d^2$, it fails for a broad class of problems whenever $n \lesssim d^2$. We then show that under suitable regularity conditions, applying a jackknife correction yields an estimate that is $\sqrt{n}$-consistent and asymptotically normal whenever $n\gtrsim d^{3/2}$. This provides strong motivation for the use of the jackknife in high-dimensional problems where the dimension is moderate relative to sample size. We illustrate consequences of our general theory for various specific $Z$-estimators, including non-linear functionals in linear models; generalized linear models; and the inverse propensity score weighting (IPW) estimate for the average treatment effect, among others.
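
The classical jackknife correction mentioned above can be sketched in a few lines. This is a minimal illustration under simple assumptions, not the high-dimensional $Z$-estimator setting studied in the paper: the function names `jackknife` and `plug_in_variance` and the toy sample are hypothetical, and the estimator is treated purely as a black box that is re-run on each leave-one-out sample.

```python
from statistics import fmean, variance


def jackknife(data, estimator):
    """Return (bias-corrected estimate, jackknife variance estimate).

    `estimator` is treated as a black box: it is only ever called on
    the full sample and on the n leave-one-out samples.
    """
    n = len(data)
    theta_full = estimator(data)
    # Leave-one-out estimates: drop observation i and re-estimate.
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = fmean(loo)
    # Jackknife bias estimate and the corresponding correction.
    bias = (n - 1) * (loo_mean - theta_full)
    corrected = theta_full - bias
    # Jackknife variance estimate of the estimator.
    var_hat = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return corrected, var_hat


def plug_in_variance(xs):
    """Biased plug-in estimate of the variance (divides by n)."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


# Hypothetical toy sample, used only for illustration.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
corrected, var_hat = jackknife(sample, plug_in_variance)
unbiased = variance(sample)  # divides by n - 1
```

For this particular functional the correction is exact: the jackknife-corrected plug-in variance coincides with the usual unbiased sample variance, which makes it a convenient sanity check.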

The Bethe-Hessian matrix, introduced by Saade, Krzakala, and Zdeborová (2014), is a Hermitian matrix designed for applying spectral clustering algorithms to sparse networks. Rather than employing a non-symmetric and high-dimensional non-backtracking operator, a spectral method based on the Bethe-Hessian matrix is conjectured to also reach the Kesten-Stigum detection threshold in the sparse stochastic block model (SBM). We provide the first rigorous analysis of the Bethe-Hessian spectral method in the SBM under both the bounded expected degree and the growing degree regimes. Specifically, we demonstrate that: (i) When the expected degree $d\geq 2$, the number of negative outliers of the Bethe-Hessian matrix can consistently estimate the number of blocks above the Kesten-Stigum threshold, thus confirming a conjecture from Saade, Krzakala, and Zdeborová (2014) for $d\geq 2$. (ii) For sufficiently large $d$, its eigenvectors can be used to achieve weak recovery. (iii) As $d\to\infty$, we establish the concentration of the locations of its negative outlier eigenvalues, and weak consistency can be achieved via a spectral method based on the Bethe-Hessian matrix.
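
As a rough illustration of point (i), the sketch below builds the Bethe-Hessian $H(r) = (r^2-1)I - rA + D$ of Saade, Krzakala, and Zdeborová (2014) for a small graph and counts its negative eigenvalues. Several things here are assumptions made for the example: the choice $r = \sqrt{\text{average degree}}$, the helper names, the deliberately trivial toy graph (two disjoint 5-cliques rather than a sparse SBM sample), and the use of pivot signs from symmetric elimination (Sylvester's law of inertia) in place of an eigensolver so that the example needs only the standard library.

```python
import math


def bethe_hessian(adj, r):
    """H(r) = (r^2 - 1) I - r A + D for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [
        [(r * r - 1 + deg[i] if i == j else 0.0) - r * adj[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def count_negative_eigenvalues(sym):
    """Count negative eigenvalues of a symmetric matrix.

    Uses the signs of the pivots of symmetric Gaussian elimination
    (Sylvester's law of inertia). No pivoting is done, so this assumes
    every pivot is nonzero; a practical version would call a symmetric
    eigensolver instead.
    """
    a = [row[:] for row in sym]
    n = len(a)
    negatives = 0
    for k in range(n):
        pivot = a[k][k]
        if abs(pivot) < 1e-12:
            raise ValueError("zero pivot; use an eigensolver instead")
        if pivot < 0:
            negatives += 1
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return negatives


# Hypothetical toy graph: two disjoint cliques on 5 nodes each.
size, blocks = 5, 2
n = size * blocks
adj = [[1 if i != j and i // size == j // size else 0 for j in range(n)]
       for i in range(n)]
avg_degree = sum(map(sum, adj)) / n
H = bethe_hessian(adj, math.sqrt(avg_degree))
estimated_blocks = count_negative_eigenvalues(H)
```

On this toy graph the count of negative eigenvalues is 2, matching the two cliques; it says nothing about behavior near the Kesten-Stigum threshold, which is what the paper analyzes.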

We introduce the extremal range, a local statistic for studying the spatial extent of extreme events in random fields on $\mathbb{R}^d$. Conditioned on exceedance of a high threshold at a location $s$, the extremal range at $s$ is the random variable defined as the smallest distance from $s\in\mathbb{R}^d$ to a location where there is a nonexceedance. We leverage tools from excursion-set theory, such as Lipschitz-Killing curvatures, to express distributional properties of the extremal range, including asymptotics for small distances and high thresholds. The extremal range captures the rate at which the spatial extent of conditional extreme events scales for increasingly high thresholds, and we relate its distributional properties with the well-known bivariate tail dependence coefficient and the extremal index of time series in Extreme-Value Theory. We calculate theoretical extremal-range properties for commonly used models, such as Gaussian or regularly varying random fields. Numerical studies illustrate that, when the extremal range is estimated from discretized excursion sets observed on compact observation windows, the distribution of the resulting estimators appropriately reproduces the theoretically derived links with the Lipschitz-Killing curvature densities.
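
The definition of the extremal range can be made concrete on a discretized field. The sketch below is only an illustration under assumptions that are not taken from the paper: the field is observed on a unit-spaced 2D grid, "exceedance" means a value strictly above the threshold, distances are Euclidean between grid cells, and the function name and toy field are hypothetical. It is a brute-force empirical version, not the estimator used in the numerical studies.

```python
import math


def extremal_range(field, site, threshold):
    """Empirical extremal range at `site` on a unit-spaced grid.

    Returns None when the field does not exceed `threshold` at `site`
    (the extremal range is only defined conditionally on exceedance),
    and math.inf when no nonexceedance is observed in the window.
    """
    i0, j0 = site
    if field[i0][j0] <= threshold:
        return None
    best = math.inf
    for i, row in enumerate(field):
        for j, value in enumerate(row):
            if value <= threshold:  # nonexceedance location
                best = min(best, math.hypot(i - i0, j - j0))
    return best


# Hypothetical toy field: a 3x3 patch of high values inside a low border.
field = [
    [0.1, 0.2, 0.3, 0.2, 0.1],
    [0.2, 2.0, 2.5, 2.1, 0.3],
    [0.3, 2.2, 3.0, 2.4, 0.2],
    [0.2, 2.1, 2.6, 2.0, 0.1],
    [0.1, 0.3, 0.2, 0.4, 0.2],
]
center_range = extremal_range(field, (2, 2), threshold=1.0)
edge_range = extremal_range(field, (1, 1), threshold=1.0)
```

Here the center of the patch is two grid steps from the nearest nonexceedance and a corner of the patch is one step away, so the two calls return 2.0 and 1.0 respectively.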

In many research fields, researchers aim to identify significant associations between a set of explanatory variables and a response while controlling the false discovery rate (FDR). To this end, we develop a fully Bayesian generalization of the classical model-X knockoff filter. The knockoff filter introduces controlled noise in the model in the form of cleverly constructed copies of the predictors as auxiliary variables. In our approach we consider the joint model of the covariates and the response and incorporate the conditional independence structure of the covariates into the prior distribution of the auxiliary knockoff variables. We further incorporate the estimation of a graphical model among the covariates, which in turn aids knockoff generation and improves the estimation of the covariate effects on the response. We use a modified spike-and-slab prior on the regression coefficients, which avoids the increase in model dimension that is typical of the classical knockoff filter. Our model performs variable selection using an upper bound on the posterior probability of non-inclusion. We show how our model construction leads to valid model-X knockoffs and demonstrate that the proposed characterization is sufficient for controlling the BFDR at an arbitrary level, in finite samples. We also show that the model selection is robust to the estimation of the precision matrix. We use simulated data to demonstrate that our proposal increases the stability of the selection with respect to classical knockoff methods, as it relies on the entire posterior distribution of the knockoff variables instead of a single sample. With respect to Bayesian variable selection methods, we show that our selection procedure achieves comparable or better performance, while maintaining control over the FDR. Finally, we show the usefulness of the proposed model with an application to real data.

In causal inference, many estimands of interest can be expressed as a linear functional of the outcome regression function; this includes, for example, average causal effects of static, dynamic and stochastic interventions. For learning such estimands, in this work, we propose novel debiased machine learning estimators that are doubly robust asymptotically linear, thus providing not only doubly robust consistency but also facilitating doubly robust inference (e.g., confidence intervals and hypothesis tests). To do so, we first establish a key link between calibration, a machine learning technique typically used in prediction and classification tasks, and the conditions needed to achieve doubly robust asymptotic linearity. We then introduce calibrated debiased machine learning (C-DML), a unified framework for doubly robust inference, and propose a specific C-DML estimator that integrates cross-fitting, isotonic calibration, and debiased machine learning estimation. A C-DML estimator maintains asymptotic linearity when either the outcome regression or the Riesz representer of the linear functional is estimated sufficiently well, allowing the other to be estimated at arbitrarily slow rates or even inconsistently. We propose a simple bootstrap-assisted approach for constructing doubly robust confidence intervals. Our theoretical and empirical results support the use of C-DML to mitigate bias arising from the inconsistent or slow estimation of nuisance functions.

If the probability model is correctly specified, then we can estimate the covariance matrix of the asymptotic maximum likelihood estimate distribution using either the first or second derivatives of the likelihood function. Therefore, if the determinants of these two different covariance matrix estimation formulas differ, this indicates model misspecification. This misspecification detection strategy is the basis of the Determinant Information Matrix Test ($GIMT_{Det}$). To investigate the performance of the $GIMT_{Det}$, a Deterministic Input Noisy And gate (DINA) Cognitive Diagnostic Model (CDM) was fit to the Fraction-Subtraction dataset. Next, various misspecified versions of the original DINA CDM were fit to bootstrap data sets generated by sampling from the original fitted DINA CDM. The $GIMT_{Det}$ showed good discrimination performance for larger levels of misspecification. In addition, the $GIMT_{Det}$ did not detect model misspecification when none was present, nor when the level of misspecification was very low. However, the $GIMT_{Det}$ discrimination performance was highly variable across different misspecification strategies when the misspecification level was moderately sized. The proposed new misspecification detection methodology is promising but additional empirical studies are required to further characterize its strengths and limitations.
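
The idea of comparing first-derivative and second-derivative information estimates can be illustrated on a one-parameter model. The sketch below is not the $GIMT_{Det}$ statistic and does not involve the DINA model: it assumes a Poisson model with rate $\lambda$, uses hypothetical function names and hand-made count data, and reports only the raw log-determinant gap (in the scalar case the determinant is the value itself) without any test calibration.

```python
import math


def poisson_information_estimates(counts):
    """Return (Hessian-based, gradient-based) information estimates.

    Both are evaluated at the Poisson MLE lam = mean(counts).
    Second derivative: -d^2/dlam^2 log p(x; lam) = x / lam^2.
    First derivative:   d/dlam   log p(x; lam) = x / lam - 1.
    """
    n = len(counts)
    lam = sum(counts) / n
    hessian_info = sum(x / lam ** 2 for x in counts) / n
    gradient_info = sum((x / lam - 1.0) ** 2 for x in counts) / n
    return hessian_info, gradient_info


def log_det_gap(counts):
    """Log-determinant difference between the two information estimates.

    Near zero when the two estimates agree, as expected under correct
    specification; far from zero suggests misspecification.
    """
    hessian_info, gradient_info = poisson_information_estimates(counts)
    return math.log(gradient_info) - math.log(hessian_info)


# Hypothetical data with sample variance equal to the mean (Poisson-like).
equidispersed = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
# Hypothetical data with variance far above the mean (overdispersed).
overdispersed = [0, 0, 0, 0, 0, 0, 0, 0, 10, 10]

gap_ok = log_det_gap(equidispersed)
gap_bad = log_det_gap(overdispersed)
```

For the Poisson model the gap reduces to the log of the variance-to-mean ratio, so the first data set gives a gap of zero and the second gives $\log 8$, flagging overdispersion that a Poisson model cannot represent.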

Point processes are widely used statistical models for uncovering the temporal patterns in dependent event data. In many applications, the event time cannot be observed exactly, calling for the incorporation of time uncertainty into the modeling of point process data. In this work, we introduce a framework to model time-uncertain point processes, possibly on a network. We start by deriving the formulation in the continuous-time setting under a few assumptions motivated by application scenarios. After imposing a time grid, we obtain a discrete-time model that facilitates inference and can be computed by first-order optimization methods such as Gradient Descent or Variational Inequality (VI) using batch-based Stochastic Gradient Descent (SGD). The parameter recovery guarantee is proved for VI inference at an $O(1/k)$ convergence rate using $k$ SGD steps. Our framework handles non-stationary processes by modeling the inference kernel as a matrix (or tensor on a network), and it covers stationary processes, such as the classical Hawkes process, as a special case. We experimentally show that the proposed approach outperforms previous General Linear Model (GLM) baselines on simulated and real data and reveals meaningful causal relations on a Sepsis-associated Derangements dataset.