2025.acl-long.410@ACL

Total: 1

#1 Exploiting the Shadows: Unveiling Privacy Leaks through Lower-Ranked Tokens in Large Language Models

Authors: Yuan Zhou, Zhuo Zhang, Xiangyu Zhang

Large language models (LLMs) play a crucial role in modern applications but face vulnerabilities related to the extraction of sensitive information. These include unauthorized access to internal prompts and retrieval of personally identifiable information (PII), e.g., in Retrieval-Augmented Generation (RAG)-based agentic applications. We examine these vulnerabilities in a question-answering (QA) setting where LLMs use retrieved documents or training knowledge as few-shot prompts. Although these documents remain confidential under normal use, adversaries can manipulate input queries to extract private content. In this paper, we propose a novel attack method that exploits the model's lower-ranked output tokens to leak sensitive information. We systematically evaluate our method, demonstrating its effectiveness both in the agentic-application privacy extraction setting and in direct training-data extraction. These findings reveal critical privacy risks in LLMs and emphasize the urgent need for enhanced safeguards against information leakage.
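The core observation in the abstract — that tokens ranked below the greedy choice can still carry private content — can be illustrated with a toy sketch. This is not the paper's method; the logit values and token strings below are hypothetical, and the point is only that many inference APIs expose top-k candidates beyond the rank-1 token.

```python
import math

# Hypothetical next-token logits for illustration only (not from a real model).
# Suppose the context is a RAG prompt containing a private name.
logits = {"the": 3.0, "[PRIVATE_NAME]": 2.1, "Alice": 1.8, "Bob": 1.2}

# Softmax over this vocabulary slice to get next-token probabilities.
exps = {tok: math.exp(v) for tok, v in logits.items()}
total = sum(exps.values())
probs = {tok: exps[tok] / total for tok in exps}

# Greedy decoding surfaces only the rank-1 token...
ranked = sorted(probs, key=probs.get, reverse=True)
top1 = ranked[0]

# ...while lower-ranked candidates (rank 2, 3, ...) remain hidden in normal
# use, yet are often observable through top-k log-probability outputs.
lower_ranked = ranked[1:]
```

Under this toy distribution, `top1` is `"the"`, while the sensitive candidate sits among `lower_ranked` — the kind of signal the paper's attack setting targets.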

Subject: ACL.2025 - Long Papers