2505.20282


One-shot Entropy Minimization

Authors: Zitian Gao, Lynx Chen, Joey Zhou, Bryan Dai

We trained 13,440 large language models and found that entropy minimization requires only a single unlabeled example and 10 optimization steps to achieve performance improvements comparable to, or even greater than, those obtained using thousands of examples and carefully designed rewards in rule-based reinforcement learning. This striking result may prompt a rethinking of post-training paradigms for large language models. Our code is available at https://github.com/zitian-gao/one-shot-em.
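To make the idea of entropy minimization concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: instead of updating the parameters of a language model over generated tokens, it applies 10 gradient-descent steps directly to the logits of a single categorical distribution, using the Shannon entropy of the softmax output as the only loss (no labels, no reward). The function names, the example logits, and the learning rate are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_grad(logits):
    """Gradient of H(softmax(z)) with respect to the logits z.

    Analytically, dH/dz_i = -p_i * (log p_i + H).
    """
    probs = softmax(logits)
    h = entropy(probs)
    return [-p * (math.log(p) + h) if p > 0.0 else 0.0 for p in probs]


def minimize_entropy(logits, steps=10, lr=1.0):
    """Run plain gradient descent on the entropy (hypothetical helper).

    Returns the final logits and the entropy recorded before the first
    step and after every step.
    """
    history = [entropy(softmax(logits))]
    for _ in range(steps):
        grad = entropy_grad(logits)
        logits = [z - lr * g for z, g in zip(logits, grad)]
        history.append(entropy(softmax(logits)))
    return logits, history


if __name__ == "__main__":
    # Toy "next-token" logits over a 4-word vocabulary (made-up values).
    final_logits, history = minimize_entropy([2.0, 1.0, 0.5, 0.1])
    for step, h in enumerate(history):
        print(f"step {step:2d}  entropy = {h:.4f}")
```

In this toy setting the update sharpens the distribution around whichever outcome is already most likely, which is the basic effect an entropy objective has; how that translates into benchmark gains for a full model is a claim of the paper that this sketch does not reproduce.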

Subject: Computation and Language

Published: 2025-05-26 17:58:30 UTC