2025.findings-acl.1285@ACL

Total: 1

#1 Rectifying Belief Space via Unlearning to Harness LLMs’ Reasoning

Authors: Ayana Niwa, Masahiro Kaneko, Kentaro Inui

Large Language Models (LLMs) exhibit sophisticated reasoning yet still generate incorrect answers. We attribute these errors to **Spurious Beliefs**, defined as propositions the model internally considers true despite being factually false. To reduce reasoning errors, we propose a belief space rectification framework. Our method first identifies the beliefs invoked during inference via an explanation-based approach with Forward-Backward Beam Search (FBBS). We then apply unlearning via gradient ascent to suppress spurious beliefs and enhance true ones, thereby rectifying the model's belief space. Experiments on three QA datasets and three LLMs show that our method significantly reduces erroneous reasoning and improves generalization.
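The rectification step described in the abstract (gradient ascent on spurious beliefs, standard training on true ones) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the model name, belief statements, learning rate, and the way the two losses are combined are all illustrative assumptions.

```python
# Minimal sketch (not the paper's released code): "unlearning" spurious beliefs
# by gradient ASCENT on their language-modeling loss, while reinforcing true
# beliefs by ordinary gradient DESCENT. Model, data, and hyperparameters are
# placeholders chosen for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper evaluates three (unspecified here) LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Hypothetical belief statements identified as spurious vs. true.
spurious_beliefs = ["Bats are blind."]
true_beliefs = ["Bats can see; many species also use echolocation."]

def lm_loss(text: str) -> torch.Tensor:
    """Causal-LM loss of the model on a single belief statement."""
    batch = tokenizer(text, return_tensors="pt")
    return model(**batch, labels=batch["input_ids"]).loss

model.train()
for spurious, true in zip(spurious_beliefs, true_beliefs):
    optimizer.zero_grad()
    # Negating the loss on the spurious belief turns the descent step into
    # ascent (suppression); the true belief keeps the usual descent direction.
    loss = -lm_loss(spurious) + lm_loss(true)
    loss.backward()
    optimizer.step()
```

In practice one would balance the two terms and limit the number of update steps so that suppression of spurious beliefs does not degrade the model's general capabilities; the abstract does not specify these details, so they are left as assumptions here.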

Subject: ACL.2025 - Findings