2606.19781

Towards Engineering Scaling Laws with Pretraining Data Composition

Authors: Jan-Lucas Uslu, Kevin Greif, Daniel Whiteson, Benjamin Nachman

Neural scaling laws describe how model performance improves as a power law in compute, model size, and dataset size. While well established for large language models, these relationships are only now emerging for large models in particle physics. As with language, empirical studies show that performance scales as a power law. However, unlike natural language or image domains, fundamental physics has high-fidelity simulators that produce synthetic data cheaply. This favors scaling regimes where additional data is cheaper than additional parameters, and allows the pretraining dataset itself to be engineered to influence the scaling. For the task of classifying hadronic jets produced in collisions of high-energy particle beams, we show that the scaling behavior can be engineered towards requiring more data rather than larger models by including pretraining data that is more diverse and better aligned with the downstream classification task.
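
The abstract does not state a functional form, so the following is only a minimal sketch of what "scales as a power law" commonly means in practice. It assumes the simple form loss = a * D^(-b) in dataset size D, with no irreducible-loss term, and fits it by linear regression in log-log space. The function name `fit_power_law`, the variable names, and all numeric values are hypothetical and purely illustrative; none of them come from the paper.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = a * x**(-b) by least squares in log-log space; return (a, b).

    Assumed form for illustration only; the paper may use a different one.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # log(y) = log(a) - b * log(x), so a straight-line fit gives both terms.
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic, noise-free values chosen for illustration (not paper results).
dataset_sizes = np.array([1e4, 1e5, 1e6, 1e7])
true_a, true_b = 5.0, 0.3
losses = true_a * dataset_sizes ** (-true_b)

a_hat, b_hat = fit_power_law(dataset_sizes, losses)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Under this assumed form, a larger exponent b means that loss falls faster as data is added, which is one way to read a scaling behavior that favors more data over larger models.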

Subjects: High Energy Physics - Experiment, Artificial Intelligence

Published: 2026-06-18 04:32:06 UTC