2025.findings-emnlp.459@ACL

Total: 1

#1 DAPE-BR: Distance-Aware Positional Encoding for Mitigating Object Hallucination in LVLMs [PDF] [Copy] [Kimi] [REL]

Authors: Mingrui Xie, Tianxiang Xu, Qianhai Tang, Shanming Yao, Xiaofeng Zhang, Junliang Du

Large Vision–Language Models (LVLMs) have garnered substantial interest owing to their impressive ability to interpret visual inputs and converse with users.Nevertheless, LVLMs still suffer from object hallucination – generating descriptions for objects that are absent from the image, which undermines reliability and hinders real-world deployment. We propose DAPE-BR, a positional-alignment scheme that (i) preserves the pretrained weight order while globally—- visual–text distances, (ii) embeds an isotropic fused patch-distance metric, and (iii) applies a patch-distance causal mask to enforce spatial causality. Extensive experiments on POPE, MMStar and SQA show that DAPE-BR consistently reduces hallucinations and boosts.

Subject: EMNLP.2025 - Findings