Gradient Flow Provably Learns Robust Classifiers for Orthonormal GMMs

Deep learning-based classifiers are known to be vulnerable to adversarial attacks. Existing methods for defending against such attacks require adding a defense mechanism or modifying the learning procedure (e.g., by adding adversarial examples). This paper shows that for certain data distributions one can learn a provably robust classifier using standard learning methods and without adding a defense mechanism. More specifically, this paper addresses the problem of finding a robust classifier for a binary classification problem in which the data comes from an isotropic mixture of Gaussians with orthonormal cluster centers. First, we characterize the largest $\ell_2$-attack any classifier can defend against while maintaining high accuracy, and show the existence of optimal robust classifiers achieving this maximum $\ell_2$-robustness. Next, we show that given data from the orthonormal Gaussian mixture model, gradient flow on a two-layer network with a polynomial ReLU activation and without adversarial examples provably finds an optimal robust classifier.
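The setting described in the abstract can be illustrated with a small, hedged sketch. The abstract does not specify the loss, network width, initialization, activation power, or which layers are trained, so everything below is an assumption made for illustration: cluster centers are taken to be standard basis vectors (one convenient orthonormal choice), the activation is `relu(z)**p` with `p = 3`, the loss is logistic, the second layer `v` is held fixed, and gradient flow is approximated by small-step gradient descent. The names `sample_orthonormal_gmm`, `prelu_net`, `logistic_loss`, and `gradient_step` are hypothetical helpers, not the authors' code.

```python
import math
import random

def sample_orthonormal_gmm(n, d, k_pos, k_neg, sigma, rng):
    """Sample n labeled points from an isotropic Gaussian mixture whose
    k_pos + k_neg centers are the first standard basis vectors of R^d
    (an assumed orthonormal choice, for illustration only)."""
    xs, ys = [], []
    for _ in range(n):
        k = rng.randrange(k_pos + k_neg)           # uniform cluster index
        x = [sigma * rng.gauss(0.0, 1.0) for _ in range(d)]
        x[k] += 1.0                                # shift by the center e_k
        xs.append(x)
        ys.append(1.0 if k < k_pos else -1.0)      # first k_pos clusters: class +1
    return xs, ys

def prelu_net(x, W, v, p):
    """Two-layer network f(x) = sum_j v_j * relu(<w_j, x>)**p."""
    total = 0.0
    for w_j, v_j in zip(W, v):
        pre = sum(a * b for a, b in zip(w_j, x))
        total += v_j * max(pre, 0.0) ** p
    return total

def logistic_loss(xs, ys, W, v, p):
    """Mean logistic loss log(1 + exp(-y * f(x))) over the sample."""
    losses = (math.log1p(math.exp(-y * prelu_net(x, W, v, p)))
              for x, y in zip(xs, ys))
    return sum(losses) / len(ys)

def gradient_step(xs, ys, W, v, p, lr):
    """One explicit-Euler step of size lr on the first-layer weights W,
    a discretization of gradient flow on the mean logistic loss."""
    n = len(ys)
    grad = [[0.0] * len(W[0]) for _ in W]
    for x, y in zip(xs, ys):
        f = prelu_net(x, W, v, p)
        dloss_df = -y / (1.0 + math.exp(y * f))    # derivative of the loss in f
        for j, (w_j, v_j) in enumerate(zip(W, v)):
            pre = sum(a * b for a, b in zip(w_j, x))
            if pre > 0.0:                          # inactive units have zero gradient
                coef = dloss_df * v_j * p * pre ** (p - 1) / n
                for i, x_i in enumerate(x):
                    grad[j][i] += coef * x_i
    return [[w - lr * g for w, g in zip(w_j, g_j)]
            for w_j, g_j in zip(W, grad)]

# Small demo with assumed sizes: 4 clusters in R^6, 8 hidden units, p = 3.
rng = random.Random(0)
d, h, p = 6, 8, 3
xs, ys = sample_orthonormal_gmm(200, d, k_pos=2, k_neg=2, sigma=0.1, rng=rng)
W = [[0.1 * rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(h)]  # small init
v = [1.0 if j % 2 == 0 else -1.0 for j in range(h)]  # fixed second layer (assumption)

loss_before = logistic_loss(xs, ys, W, v, p)
for _ in range(50):
    W = gradient_step(xs, ys, W, v, p, lr=0.1)
loss_after = logistic_loss(xs, ys, W, v, p)
print(loss_before, loss_after)
```

This sketch only shows the data model and training dynamics in miniature. It does not measure $\ell_2$-robustness and should not be read as evidence for or against the paper's theorem.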

Subject: ICML.2025 - Poster

y6aT5rlrlv@OpenReview
