x2Dw9aNbvw@OpenReview

Total: 1

#1 Heads up! Large Language Models Can Perform Tasks Without Your Instruction via Selective Attention Head Masking

Authors: Senyu Han, Hongchuan Zeng, Kai Yu, Lu Chen

Large language models (LLMs) consist of numerous Transformer modules, and while the models can perform various functions, it remains an open question how these modules are combined to elicit distinct inherent functionalities. In this paper, we investigate the modules inside LLMs and demonstrate that, by simply masking or retaining specific attention heads during inference, LLMs can exhibit specific task functionalities without requiring explicit instructions or modifications to the model parameters. Experiments across various models and tasks reveal that LLMs inherently encode "functional pathways": structured groups of interdependent attention heads that are crucial for executing specific tasks. These pathways not only govern the model's functional behaviors but also enhance parameter efficiency, as suppressing attention heads outside the pathway can improve task performance. The code is available at [https://github.com/OpenDFM/HeadsUp](https://github.com/OpenDFM/HeadsUp).
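The sketch below illustrates the general idea of masking selected attention heads at inference time without touching model parameters. It is not the authors' implementation (that is in the linked repository); it assumes a LLaMA-style Hugging Face model whose attention modules expose an `o_proj` output projection fed by the concatenated per-head outputs, and the model name and the `heads_to_mask` "pathway" are illustrative placeholders.

```python
# Minimal sketch (not the paper's code): suppress chosen attention heads at
# inference time via forward pre-hooks on each layer's output projection.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; any LLaMA-style model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

# Hypothetical "pathway complement": {layer index: [head indices to suppress]}.
heads_to_mask = {3: [0, 7], 10: [2], 21: [5, 11]}

head_dim = model.config.hidden_size // model.config.num_attention_heads
hooks = []

def make_pre_hook(head_ids):
    # Zero the slice of the concatenated head outputs belonging to each
    # suppressed head, just before the output projection mixes the heads.
    def pre_hook(module, args):
        hidden = args[0].clone()  # [batch, seq, num_heads * head_dim]
        for h in head_ids:
            hidden[..., h * head_dim:(h + 1) * head_dim] = 0.0
        return (hidden,)
    return pre_hook

for layer_idx, head_ids in heads_to_mask.items():
    o_proj = model.model.layers[layer_idx].self_attn.o_proj
    hooks.append(o_proj.register_forward_pre_hook(make_pre_hook(head_ids)))

prompt = "The capital of France is"
with torch.no_grad():
    out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=16)
print(tok.decode(out[0], skip_special_tokens=True))

for h in hooks:
    h.remove()  # restore the unmasked model
```

Hooks are used here (rather than editing weights) so the intervention is fully reversible, matching the paper's premise that no parameter modification is required.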

Subject: ICML.2025 - Poster