Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation

Authors: Shuo Tang, Xianghe Pang, Zexi Liu, Bohan Tang, Rui Ye, Xiaowen Dong, Yanfeng Wang, Siheng Chen

Post-training is essential for enabling large language models (LLMs) to follow human instructions. Inspired by the recent success of using LLMs to simulate human society, we leverage multi-agent simulation to automatically generate diverse text-based scenarios that capture a wide range of real-world human needs. We propose MATRIX, a multi-agent simulator that creates realistic and scalable scenarios. Building on these scenarios, we introduce MATRIX-Gen, a novel scenario-driven instruction generator for controllable and highly realistic data synthesis. Extensive experiments demonstrate that our framework effectively generates both general and domain-specific data. Notably, on the AlpacaEval 2 and Arena-Hard benchmarks, Llama-3-8B-Base post-trained on just 20K instruction-response pairs synthesized by MATRIX-Gen outperforms Meta's Llama-3-8B-Instruct model, which was trained on over 10M pairs. Project page: https://github.com/ShuoTang123/MATRIX-Gen.

Subjects: Artificial Intelligence, Computation and Language

Published: 2024-10-18 08:01:39 UTC

arXiv: 2410.14251
