36uy2GgAy6@OpenReview


When Data Can't Meet: Estimating Correlation Across Privacy Barriers

Authors: Abhinav Chakraborty, Arnab Auddy, T. Tony Cai

We consider the problem of estimating the correlation between two random variables $X$ and $Y$ when the pairs $(X,Y)$ are not observed together but are instead split coordinate-wise across two servers: server 1 holds all the $X$ observations, and server 2 holds the corresponding $Y$ observations. In this vertically distributed setting, each server has its own privacy constraints and can therefore share only suitably privatized statistics of its own component observations. We allow differing privacy budgets $(\varepsilon_1,\delta_1)$ and $(\varepsilon_2,\delta_2)$ for the two servers and determine the minimax optimal rates for correlation estimation under both non-interactive and interactive mechanisms. We also provide correlation estimators that achieve these rates and develop inference procedures, namely confidence intervals, for the estimated correlations. The resulting rates depend on the sample size $n$ and the budgets $\varepsilon_1$, $\varepsilon_2$, and are strictly slower than the usual central privacy estimation rates. Moreover, we find that the interactive mechanism is always better than its non-interactive counterpart whenever the two privacy budgets differ. Extensive numerical experiments support our theoretical findings.
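To make the vertically distributed, non-interactive setting concrete, the following is a minimal illustrative sketch and not the estimator proposed in the paper. It assumes zero-mean bivariate Gaussian data and uses a swapped-in textbook technique: each server releases randomized-response signs of its own observations (which gives $\varepsilon$-local differential privacy per observation, with $\delta = 0$), and an analyst debiases the product of the two privatized sign sequences and inverts the Gaussian identity $E[\mathrm{sign}(X)\,\mathrm{sign}(Y)] = (2/\pi)\arcsin\rho$. All function names are hypothetical, and the sketch makes no claim to attain the minimax rates discussed in the abstract.

```python
import math
import random


def randomized_response_signs(values, epsilon, rng):
    """Run locally on one server: release sign(v) for each value,
    keeping the true sign with probability e^eps / (1 + e^eps)."""
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    released = []
    for v in values:
        s = 1 if v >= 0 else -1
        released.append(s if rng.random() < keep_prob else -s)
    return released


def estimate_correlation(private_x_signs, private_y_signs, eps1, eps2):
    """Run by the analyst, who sees only the two privatized sign sequences.
    Assumes zero-mean bivariate Gaussian data (an assumption of this sketch)."""
    n = len(private_x_signs)
    raw = sum(a * b for a, b in zip(private_x_signs, private_y_signs)) / n
    # Each privatized sign has conditional mean tanh(eps / 2) * true sign,
    # and the two servers' noise is independent, so the biases multiply.
    shrink = math.tanh(eps1 / 2.0) * math.tanh(eps2 / 2.0)
    tau = max(-1.0, min(1.0, raw / shrink))
    # Invert E[sign(X) sign(Y)] = (2 / pi) * arcsin(rho).
    return math.sin(math.pi / 2.0 * tau)


def demo(rho, n, eps1, eps2, seed=0):
    """Simulate both servers on synthetic Gaussian data and return the estimate."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)                                         # held by server 1
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)  # held by server 2
    sx = randomized_response_signs(xs, eps1, rng)  # server 1's release
    sy = randomized_response_signs(ys, eps2, rng)  # server 2's release
    return estimate_correlation(sx, sy, eps1, eps2)
```

Calling `demo(0.5, 200000, 1.0, 2.0)` simulates unequal budgets $\varepsilon_1 = 1$ and $\varepsilon_2 = 2$. The debiasing factor $\tanh(\varepsilon_1/2)\tanh(\varepsilon_2/2)$ shrinks toward zero as either budget shrinks, which inflates the variance of the estimate and gives some intuition for why rates in this setting are slower than in the central model.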

Subject: NeurIPS.2025 - Spotlight