2025.findings-acl.1004@ACL

Total: 1

#1 Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models

Authors: Qingyu Ren, Jie Zeng, Qianyu He, Jiaqing Liang, Yanghua Xiao, Weikang Zhou, Zeye Sun, Fei Yu

It is crucial for large language models (LLMs) to follow instructions that involve multiple constraints. In real-world scenarios, user instructions often contain soft constraints, which are semantically related and cannot be verified with rules, posing challenges for LLMs. To enhance the soft-constraint-following ability of LLMs, we first design a pipeline that automatically constructs datasets with high-quality outputs for instructions containing soft constraints. Additionally, to fully utilize the positive and negative samples generated during data construction, we adopt Direct Preference Optimization (DPO) as the training method. Furthermore, since the number of constraints indicates the difficulty of soft constraints, we design a curriculum learning training paradigm that orders training by constraint quantity. We experimentally evaluate the effectiveness of our methods in improving LLMs' soft-constraint-following ability and analyze the factors driving the improvements.
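To make the training recipe concrete, below is a minimal sketch (not the authors' released code) of the two ingredients the abstract names: the standard DPO loss over preference pairs and a curriculum that orders those pairs by the number of soft constraints in the instruction, from fewer to more. The data, log-probabilities, and helper names (dpo_loss, curriculum_stages) are hypothetical placeholders for illustration only.

# Minimal sketch, assuming preference pairs with pre-computed sequence
# log-probs under the policy and a frozen reference model.
import math
from collections import defaultdict

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair: -log sigmoid(beta * reward margin)."""
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    return math.log(1.0 + math.exp(-margin))  # equals -log sigmoid(margin)

def curriculum_stages(pairs):
    """Group preference pairs by constraint count and return stages from easy to hard."""
    buckets = defaultdict(list)
    for pair in pairs:
        buckets[pair["num_constraints"]].append(pair)
    return [buckets[k] for k in sorted(buckets)]

# Toy preference pairs (hypothetical numbers): each holds the constraint count of
# the instruction plus log-probs of the chosen/rejected outputs under both models.
pairs = [
    {"num_constraints": 3, "logp_c": -12.0, "logp_r": -15.0, "ref_c": -13.0, "ref_r": -14.0},
    {"num_constraints": 1, "logp_c": -8.0,  "logp_r": -9.5,  "ref_c": -8.5,  "ref_r": -9.0},
]

for stage, batch in enumerate(curriculum_stages(pairs), start=1):
    losses = [dpo_loss(p["logp_c"], p["logp_r"], p["ref_c"], p["ref_r"]) for p in batch]
    print(f"stage {stage}: {len(batch)} pairs, mean DPO loss {sum(losses) / len(losses):.4f}")

In an actual training run, each stage would drive gradient updates of the policy before moving to pairs with more constraints; the sketch only shows the loss computation and the ordering of the data.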

Subject: ACL.2025 - Findings