Self-Generative Adversarial Fine-Tuning for Large Language Models

Authors: Shiguang Wu, Yaqing Wang, Quanming Yao

Fine-tuning large language models (LLMs) for alignment typically relies on supervised fine-tuning or reinforcement learning from human feedback, both of which are limited by the cost and scarcity of high-quality annotations. Recent self-play and synthetic data approaches reduce this dependence but often rely on heuristic assumptions or ungrounded self-evaluation, which can cause bias accumulation and performance drift. In this paper, we propose Self-Generative Adversarial LLM (SGALM), a unified fine-tuning framework that formulates alignment as a generative adversarial game within a single LLM. SGALM jointly evolves generation and discrimination capabilities without external reward models. Theoretical and empirical results demonstrate that SGALM achieves state-of-the-art performance and serves as both an effective alignment algorithm and a robust synthetic data engine.

Subject: Machine Learning

Published: 2026-02-01 10:20:27 UTC

2602.01137
