Processing math: 5%

cohen23a@v195@PMLR

Total: 1

#1 Local Glivenko-Cantelli [PDF1] [Copy] [Kimi1] [REL]

Authors: Doron Cohen, Aryeh Kontorovich

If μ is a distribution over the d-dimensional Boolean cube \set{0,1}^d, our goal is to estimate its mean p\in[0,1]^d based on n iid draws from \mu. Specifically, we consider the empirical mean estimator \pn and study the expected maximal deviation \Delta_n=\E\max_{j\in[d]}|\pn(j)-p(j)|. In the classical Universal Glivenko-Cantelli setting, one seeks distribution-free (i.e., independent of \mu) bounds on \Delta_n. This regime is well-understood: for all \mu, we have \Delta_n\lesssim\sqrt{\log(d)/n} up to universal constants, and the bound is tight.Our present work seeks to establish dimension-free (i.e., without an explicit dependence on d) estimates on \Delta_n, including those that hold for d=\infty. As such bounds must necessarily depend on \mu, we refer to this regime as {\em local} Glivenko-Cantelli (also known as \mu-GC), and are aware of very few previous bounds of this type — which are either “abstract” or quite sub-optimal. Already the special case of product measures \mu is rather non-trivial. We give necessary and sufficient conditions on \mu for \Delta_n\to0, and calculate sharp rates for this decay. Along the way, we discover a novel sub-gamma-type maximal inequality for shifted Bernoullis, of independent interest.

Subject: COLT.2023 - Accept