2025.acl-long.819@ACL

RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

Authors: Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yongkang Wu, Zhonghua Li, Ye Qi, Zhicheng Dou

Large language models (LLMs) exhibit remarkable generative capabilities but often suffer from hallucinations. Retrieval-augmented generation (RAG) offers an effective solution by incorporating external knowledge, but existing methods still face several limitations: additional deployment costs of separate retrievers, redundant input tokens from retrieved text chunks, and the lack of joint optimization of retrieval and generation. To address these issues, we propose RetroLLM, a unified framework that integrates retrieval and generation into a single, auto-regressive process, enabling LLMs to directly generate fine-grained evidence from the corpus with constrained decoding. Moreover, to mitigate false pruning during constrained evidence generation, we introduce (1) hierarchical FM-Index constraints, which generate corpus-constrained clues to identify a subset of relevant documents before evidence generation, reducing irrelevant decoding space; and (2) a forward-looking constrained decoding strategy, which considers the relevance of future sequences to improve evidence accuracy. Extensive experiments on five open-domain QA datasets demonstrate RetroLLM’s superior performance across both in-domain and out-of-domain tasks. The code is available at https://anonymous.4open.science/r/RetroLLM-D95A.
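To make the core mechanism concrete, the sketch below illustrates corpus-constrained decoding: at each step the model may only emit a token that keeps the generated span a verbatim substring of some corpus document. This is an illustrative sketch, not the authors' code; a brute-force table of substring continuations stands in for the paper's FM-Index, and the names `SubstringConstraint`, `constrained_greedy_decode`, and the toy `score_fn` are hypothetical.

```python
# Illustrative sketch of corpus-constrained decoding (not the RetroLLM code).
# A set of observed substring continuations stands in for an FM-Index; a real
# FM-Index answers the same "which tokens can extend this span?" query via
# backward search over the BWT without enumerating substrings.

from collections import defaultdict

class SubstringConstraint:
    """Allows only token sequences that occur verbatim in the corpus."""

    def __init__(self, corpus_token_ids):
        # Map each context (a tuple of token ids that occurs in some document)
        # to the set of token ids that can legally follow it in the corpus.
        self.next_tokens = defaultdict(set)
        for doc in corpus_token_ids:
            for i in range(len(doc)):
                for j in range(i, len(doc)):
                    self.next_tokens[tuple(doc[i:j])].add(doc[j])

    def allowed(self, prefix):
        """Token ids that may extend `prefix` while staying in the corpus."""
        return self.next_tokens.get(tuple(prefix), set())


def constrained_greedy_decode(score_fn, constraint, max_len=8):
    """Greedy decoding restricted to corpus substrings.

    score_fn(prefix) -> {token_id: logit}, standing in for the LLM.
    """
    prefix = []
    for _ in range(max_len):
        legal = constraint.allowed(prefix)
        if not legal:  # no corpus continuation: the evidence span ends here
            break
        scores = score_fn(prefix)
        prefix.append(max(legal, key=lambda t: scores.get(t, float("-inf"))))
    return prefix


if __name__ == "__main__":
    corpus = [[5, 7, 9, 2], [7, 9, 4]]  # toy tokenized documents
    constraint = SubstringConstraint(corpus)
    # A fake scorer that always prefers higher token ids.
    evidence = constrained_greedy_decode(lambda p: {t: t for t in range(10)},
                                         constraint)
    print(evidence)  # [9, 4]: a verbatim span of a corpus document
```

Greedy masking like this is exactly where the false-pruning problem the abstract mentions arises: a locally high-scoring legal token can commit the decoder to a corpus span with no relevant continuation, which is what the hierarchical FM-Index constraints and forward-looking decoding strategy are designed to avoid.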

Subject: ACL.2025 - Long Papers