
Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs

Authors: Liwei Che, Tony Qingze Liu, Jing Jia, Weiyi Qin, Ruixiang Tang, Vladimir Pavlovic

Despite their remarkable potential, Large Vision-Language Models (LVLMs) still face challenges with object hallucination, a problem where their generated outputs mistakenly incorporate objects that do not actually exist. Although most prior work addresses this issue within the language-model backbone, our work shifts the focus to the image input source, investigating how specific image tokens contribute to hallucinations. Our analysis reveals that a small subset of image tokens with high attention scores are the main drivers of object hallucination. By removing these hallucinatory image tokens (only 1.5% of all image tokens), the issue can be effectively mitigated, and this finding holds consistently across different models. Building on this insight, we introduce EAZY, a novel, training-free method that automatically identifies and Eliminates hAllucinations by Zeroing out hallucinatorY image tokens. We utilize EAZY for unsupervised object hallucination detection, achieving a 15% improvement compared to previous methods. Additionally, EAZY demonstrates remarkable effectiveness in mitigating hallucinations while preserving model utility and seamlessly adapting to various LVLM architectures.
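
As a rough illustration of the core idea described in the abstract (not the authors' actual EAZY pipeline or detection criterion), the sketch below zeroes out the small fraction of image tokens that receive the highest attention before they reach the language backbone. The tensor shapes, the source of the attention scores, and the use of the 1.5% figure as a drop budget are assumptions made for this example.

```python
# Minimal sketch, assuming we already have per-token attention scores for the
# visual tokens fed to the LVLM. Function and variable names are hypothetical.
import torch

def zero_high_attention_image_tokens(image_tokens: torch.Tensor,
                                     attn_scores: torch.Tensor,
                                     drop_ratio: float = 0.015) -> torch.Tensor:
    """image_tokens: (num_tokens, dim) visual embeddings passed to the backbone.
    attn_scores: (num_tokens,) attention mass each image token receives.
    Returns a copy with the highest-attention tokens zeroed out."""
    num_tokens = image_tokens.size(0)
    # Budget of tokens to suppress; ~1.5% of all image tokens per the paper's finding.
    k = max(1, int(round(drop_ratio * num_tokens)))
    # Candidate "hallucinatory" tokens: those with the largest attention scores.
    top_idx = torch.topk(attn_scores, k).indices
    cleaned = image_tokens.clone()
    cleaned[top_idx] = 0.0  # zero out the suspected hallucinatory image tokens
    return cleaned
```

In this toy version the suppressed tokens are chosen purely by attention magnitude; the paper's method identifies them automatically and without training, so the selection rule above should be read only as a stand-in for that step.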

Subject: ICCV.2025 - Poster