2025.emnlp-industry.139@ACL


Zero-knowledge LLM hallucination detection and mitigation through fine-grained cross-model consistency

Authors: Aman Goel, Daniel Schwartz, Yanjun Qi

Large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, but they remain susceptible to hallucinations: generating content that appears plausible but contains factual inaccuracies. We present Finch-Zk, a black-box framework that leverages fine-grained cross-model consistency to detect and mitigate hallucinations in LLM outputs without requiring external knowledge sources. Finch-Zk introduces two key innovations: 1) a cross-model consistency checking strategy that reveals fine-grained inaccuracies by comparing responses generated by diverse models from semantically-equivalent prompts, and 2) a targeted mitigation technique that applies precise corrections to problematic segments while preserving accurate content. Experiments on the FELM dataset show Finch-Zk improves hallucination detection F1 scores by 6-39% compared to existing approaches. For mitigation, Finch-Zk achieves an improvement of up to 9 absolute percentage points in answer accuracy on the GPQA-diamond dataset when applied to state-of-the-art models like Llama 4 Maverick and Claude 4 Sonnet. Extensive evaluation on multiple datasets demonstrates that Finch-Zk provides a practical, deployment-ready safeguard for enhancing factual reliability in production LLM systems.
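The abstract describes the two components only at a high level. The sketch below is a minimal illustration of how a fine-grained cross-model consistency check and segment-level mitigation might be wired together; the sentence-level segmentation, the token-overlap `is_supported` heuristic, and the `paraphrase` and `reviser` callables are hypothetical stand-ins, not the Finch-Zk implementation described in the paper.

```python
# Illustrative sketch (not the authors' code) of fine-grained cross-model
# consistency checking: a primary answer is split into segments, each segment
# is checked against responses from other models to semantically-equivalent
# prompts, and only poorly supported segments are regenerated.

from dataclasses import dataclass
from typing import Callable, List

# A "model" here is simply a callable from prompt text to response text.
Model = Callable[[str], str]


@dataclass
class SegmentVerdict:
    segment: str
    support: float        # fraction of cross-model responses that corroborate the segment
    hallucinated: bool    # flagged when support falls below the threshold


def split_into_segments(response: str) -> List[str]:
    """Naive sentence-level segmentation; a real system would use a finer-grained splitter."""
    return [s.strip() for s in response.replace("?", ".").split(".") if s.strip()]


def is_supported(segment: str, other_response: str) -> bool:
    """Placeholder consistency check based on token overlap; an entailment or
    semantic-equivalence judge would be used in practice."""
    seg_tokens = set(segment.lower().split())
    other_tokens = set(other_response.lower().split())
    if not seg_tokens:
        return True
    return len(seg_tokens & other_tokens) / len(seg_tokens) >= 0.6


def detect_hallucinations(
    prompt: str,
    primary_model: Model,
    reference_models: List[Model],
    paraphrase: Callable[[str], List[str]],
    threshold: float = 0.5,
) -> List[SegmentVerdict]:
    """Flag segments of the primary answer that other models, prompted with
    semantically-equivalent questions, fail to corroborate."""
    answer = primary_model(prompt)
    cross_responses = [
        model(variant)
        for model in reference_models
        for variant in [prompt] + paraphrase(prompt)
    ]
    verdicts = []
    for segment in split_into_segments(answer):
        support = sum(is_supported(segment, r) for r in cross_responses) / len(cross_responses)
        verdicts.append(SegmentVerdict(segment, support, support < threshold))
    return verdicts


def mitigate(prompt: str, verdicts: List[SegmentVerdict], reviser: Model) -> str:
    """Targeted mitigation sketch: regenerate only the flagged segments and
    keep well-supported segments verbatim."""
    revised = []
    for v in verdicts:
        if v.hallucinated:
            revised.append(
                reviser(f"Rewrite this claim about '{prompt}' so it is factually accurate: {v.segment}")
            )
        else:
            revised.append(v.segment)
    return ". ".join(revised) + "."
```

In this reading, detection cost scales with the number of reference models and prompt paraphrases, while mitigation touches only the flagged segments, which is what allows accurate content to be preserved unchanged.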

Subject: EMNLP.2025 - Industry Track