2605.28238

#1 Approximate Label Symmetries Improve Data Scaling

Authors: Scott Y. H. Kim, Mathis Lechaume-Robert, O. Anatole von Lilienfeld

Enforcing universal symmetries in machine learning (ML) models is a common strategy to mitigate data scarcity. We show that exploiting exact, as well as approximate, label symmetries can benefit scaling laws. We illustrate the idea for the s, p, d orbital densities of the electron in the hydrogen atom, for the three vibrational normal modes of the water molecule, as well as its full 3D potential energy hypersurface. The resulting ML models of electron density and potential energies exhibit superior learning curves, demonstrating improved generalization efficiency. When label symmetries are not exact, the same principles govern the observed learning behavior, up to the convergence floors set by the degree to which the symmetry is approximate. For convex wells in the molecular potential energy surface, a Hessian-based correction suppresses the leading symmetry-breaking error in augmented labels.
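The notion of a convergence floor set by an approximate label symmetry can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a hypothetical 1D well V(x) = x^2 + eps*x^3, where the reflection x -> -x is an exact label symmetry for eps = 0 and only approximate otherwise. The function names and the parameter eps are illustrative assumptions. Mirrored training points reuse the original labels, so the reused labels carry an error of 2*eps*|x|^3, which a model trained on the augmented set cannot average away.

```python
import numpy as np

def potential(x, eps):
    """Toy 1D well (assumed for illustration): harmonic term plus a cubic
    term that breaks the reflection symmetry x -> -x when eps != 0."""
    return x**2 + eps * x**3

def augment_by_reflection(x, y):
    """Append mirrored inputs (-x) that reuse the original labels y."""
    return np.concatenate([x, -x]), np.concatenate([y, y])

def augmented_label_error(x, eps):
    """Largest absolute error of the reused labels over the augmented set."""
    x_aug, y_aug = augment_by_reflection(x, potential(x, eps))
    return np.max(np.abs(y_aug - potential(x_aug, eps)))

x_train = np.linspace(0.1, 1.0, 10)

# Exact symmetry: mirrored labels are correct, augmentation is free data.
exact_err = augmented_label_error(x_train, eps=0.0)

# Approximate symmetry: mirrored labels are off by 2 * eps * |x|**3.
approx_err = augmented_label_error(x_train, eps=0.05)

print(f"label error, exact symmetry:       {exact_err:.3e}")
print(f"label error, approximate symmetry: {approx_err:.3e}")
```

In this toy setting the label error grows with both eps and the distance from the minimum, which is one way to picture why the floor depends on how strongly the symmetry is broken.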

Subject: Chemical Physics

Published: 2026-05-27 09:53:59 UTC