
Total: 1

#1 The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains

Authors: Scott Geng, Hamish Ivison, Chun-Liang Li, Maarten Sap, Jerry Li, Ranjay Krishna, Pang Wei Koh

Improvements in language models are often driven by increasing the quality of the data we train them on, which can be limiting when strong supervision is not readily available. In this work, we show that paired preference data consisting of individually weak data points can enable gains beyond the strength of each individual sample. We formulate the **delta learning hypothesis** to explain this phenomenon, positing that the relative quality _delta_ between points suffices to drive learning via preference tuning—even when supervised finetuning on the weak data hurts. We validate our hypothesis in controlled experiments and at scale, where we post-train 8B models on preference data generated by pairing a small 3B model's responses with outputs from an even smaller 1.5B model to ensure a meaningful delta. Strikingly, on a standard 11-benchmark evaluation suite (MATH, MMLU, etc.), our simple recipe matches the performance of Tülu 3, a state-of-the-art open model that was tuned from the same base as our model while relying on vastly stronger supervisors (e.g., GPT-4o). Delta learning thus enables simpler and cheaper open recipes for state-of-the-art post-training, highlighting that models can learn a surprising amount from data that might typically be considered weak.
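A minimal sketch of the recipe described above: preference pairs are built from two weak models, with the (slightly) stronger small model providing the "chosen" response and the even smaller model the "rejected" one, so that only the quality delta supervises learning. The abstract does not name a specific preference-tuning objective, so the DPO loss below, the function names, and the toy data are illustrative assumptions rather than the authors' exact implementation.

```python
# Hedged sketch: weak-vs-weaker preference pairs plus a DPO-style loss.
# The choice of DPO, the helper names, and all example values are assumptions.
import torch
import torch.nn.functional as F

def build_preference_pairs(prompts, stronger_weak_responses, weaker_responses):
    """Pair each prompt with a 'chosen' response from the small (~3B) model and a
    'rejected' response from an even smaller (~1.5B) model; the relative quality
    delta between the two weak outputs is what drives learning."""
    return [
        {"prompt": p, "chosen": c, "rejected": r}
        for p, c, r in zip(prompts, stronger_weak_responses, weaker_responses)
    ]

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective: increase the policy's preference for 'chosen' over
    'rejected' relative to a frozen reference model."""
    policy_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (policy_logratios - ref_logratios)).mean()

# Toy usage with fake sequence log-probabilities; real training would score each
# full response under the 8B policy and its frozen reference copy.
pairs = build_preference_pairs(
    ["Solve 2x + 3 = 7."],
    ["x = 2, since 2x = 4."],   # e.g. sampled from the ~3B model (chosen)
    ["x = 5."],                 # e.g. sampled from the ~1.5B model (rejected)
)
loss = dpo_loss(torch.tensor([-10.0]), torch.tensor([-14.0]),
                torch.tensor([-11.0]), torch.tensor([-13.0]))
print(len(pairs), loss.item())
```

Note that neither weak model needs to be a strong supervisor in absolute terms; the sketch only assumes a consistent quality gap between the paired responses.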

Subject: COLM.2025