ZM5usuLxur@OpenReview

#1 MARGE: Improving Math Reasoning with Guided Exploration

Authors: Jingyue Gao, Runji Lin, Keming Lu, Bowen Yu, Junyang Lin, Jianyu Chen

Large Language Models (LLMs) exhibit strong potential in mathematical reasoning, yet their effectiveness is often limited by a shortage of high-quality queries. This limitation necessitates scaling up computational responses through self-generated data, yet current methods struggle because ineffective exploration across all reasoning stages produces spuriously correlated data. To address this challenge, we introduce **MARGE**: Improving **Ma**th **R**easoning with **G**uided **E**xploration, a novel method that enhances mathematical reasoning through hit-guided exploration. MARGE systematically explores intermediate reasoning states derived from self-generated solutions, enabling adequate exploration and improved credit assignment throughout the reasoning process. Notably, MARGE improves both single-shot accuracy and exploration diversity, mitigating a common trade-off in alignment methods. These results demonstrate MARGE's effectiveness in enhancing mathematical reasoning capabilities and unlocking the potential of scaling self-generated training data.

Subject: ICML.2025 - Poster