Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation

Authors: Yu Wang, Jiaxin Zhang, Xiang Gao, Wendi Cui, Peng Li, Kamalika Das

In tasks such as summarization and open-book question answering (QA), Large Language Models (LLMs) frequently experience “contextual hallucination”, where they generate irrelevant or incorrect responses despite having access to accurate information in the input. This issue often stems from the models’ propensity to prioritize self-generated content over input context, leading to a disregard for pertinent details. To address this challenge, we introduce Guided Attention Map Editing (GAME), an innovative approach that dynamically adjusts attention maps to enhance contextual relevance. During inference, GAME employs a trained classifier to identify attention maps likely to induce hallucinations and implements targeted interventions. These interventions, guided by gradient-informed “edit directions”, strategically redistribute attention weights across various heads to efficiently mitigate hallucination. Extensive evaluations on challenging summarization and open-book QA tasks demonstrate that GAME consistently and significantly reduces hallucinations across diverse open-source models, thereby improving the reliability and applicability of LLMs.
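
The general idea of a classifier-gated, gradient-guided attention edit can be illustrated with a toy sketch. This is not the authors' implementation: the linear classifier, its weights `w` and bias `b`, the `step` and `threshold` parameters, and the clip-and-renormalize step are all assumptions made for illustration, and the example operates on a single attention row rather than on the per-head attention maps of a real model.

```python
import math

def hallucination_score(attn, w, b):
    """Hypothetical linear classifier: sigmoid(w . attn + b)."""
    z = sum(wi * ai for wi, ai in zip(w, attn)) + b
    return 1.0 / (1.0 + math.exp(-z))

def edit_attention(attn, w, b, step=0.5, threshold=0.5):
    """Edit one attention row only if the classifier flags it.

    The edit direction is the negative gradient of the classifier
    score with respect to the attention weights (an assumption of
    this sketch). Weights are then clipped at zero and renormalized
    so the row remains a valid distribution.
    """
    p = hallucination_score(attn, w, b)
    if p < threshold:
        return list(attn)  # not flagged: leave the row unchanged
    # d/da sigmoid(w . a + b) = p * (1 - p) * w
    grad = [p * (1.0 - p) * wi for wi in w]
    edited = [max(ai - step * gi, 0.0) for ai, gi in zip(attn, grad)]
    total = sum(edited)
    if total == 0.0:
        return list(attn)  # degenerate edit: fall back to the original
    return [e / total for e in edited]

# Toy row: two context tokens followed by one self-generated token.
attn = [0.1, 0.1, 0.8]
w = [-2.0, -2.0, 3.0]  # hypothetical: penalizes attention on generated text
b = -1.0
edited = edit_attention(attn, w, b)
print(edited, hallucination_score(edited, w, b))
```

In this toy setting the edit moves attention mass from the self-generated token back to the context tokens, which lowers the hypothetical classifier score; rows that the classifier does not flag are returned unchanged.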

Subject: NAACL.2025 - Findings

2025.findings-naacl.458@ACL
