2025.findings-emnlp.642@ACL

Total: 1

#1 ReAlign: Structured Revision for Small Language Model Alignment

Authors: Ruijun Chen, Jiajian Guo, Hongzhan Chen, Fanqi Wan, Qifan Wang, Xiaojun Quan

Aligning small language models with human preferences is challenging, as weak policies struggle to generate informative on-policy samples and suffer from unstable gradients when trained on off-policy signals from stronger models. In this work, we propose ReAlign, a training framework that combines the stability of on-policy learning with the guidance of reviser-assisted supervision. In ReAlign, we first train a lightweight reviser to improve policy-generated responses using preference-based supervision, conditioned on both the prompt and the initial output. The policy is then optimized with a combination of standard on-policy preference pairs and reviser-enhanced pairs constructed as a structured revision task, where the latter provide richer, more learnable feedback. Experimental results on AlpacaEval-2 and Arena-Hard demonstrate that ReAlign significantly boosts alignment performance for weak policies, outperforming strong preference optimization baselines.
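
To make the two kinds of training pairs concrete, the sketch below illustrates one plausible way such data could be assembled: each prompt yields a standard on-policy pair (best vs. worst policy sample) plus a reviser-enhanced pair framed as a structured revision task. This is only an illustrative reading of the abstract, not the authors' code; all names (PreferencePair, build_realign_pairs, sample, score, revise, and the revision-prompt format) are hypothetical placeholders.

```python
# Minimal sketch of ReAlign-style pair construction (hypothetical, not the paper's code).
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreferencePair:
    prompt: str    # input to the policy (possibly including revision context)
    chosen: str    # preferred response
    rejected: str  # dispreferred response


def build_realign_pairs(
    prompts: List[str],
    sample: Callable[[str], List[str]],   # policy: prompt -> candidate responses
    score: Callable[[str, str], float],   # preference signal, e.g. a reward model
    revise: Callable[[str, str], str],    # reviser: (prompt, draft) -> improved draft
) -> List[PreferencePair]:
    pairs: List[PreferencePair] = []
    for prompt in prompts:
        candidates = sample(prompt)
        ranked = sorted(candidates, key=lambda r: score(prompt, r), reverse=True)
        best, worst = ranked[0], ranked[-1]

        # (1) Standard on-policy preference pair: best vs. worst policy sample.
        pairs.append(PreferencePair(prompt=prompt, chosen=best, rejected=worst))

        # (2) Reviser-enhanced pair cast as a structured revision task:
        #     the policy is conditioned on its own draft and trained to prefer
        #     the reviser's improved response over that draft.
        improved = revise(prompt, best)
        revision_prompt = f"{prompt}\n\n[Draft response]\n{best}\n\n[Revise the draft]"
        pairs.append(PreferencePair(prompt=revision_prompt, chosen=improved, rejected=best))
    return pairs
```

The resulting pairs could then be fed to any pairwise preference-optimization objective (e.g. a DPO-style loss); the mixing ratio between on-policy and reviser-enhanced pairs is left unspecified here.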

Subject: EMNLP.2025 - Findings