The Role of Outgoing Connection Heterogeneity in Feedforward Layers of Large Language Models

2025.emnlp-main.1143@ACL

Authors: Felix Stahlberg, Shankar Kumar

We report on investigations into the characteristics of outgoing connections in feedforward layers of large language models. Our findings show that inner neurons with diverse outgoing connection strengths are more critical to model performance than those with uniform connections. We propose a new fine-tuning loss that takes advantage of this observation by decreasing the outgoing connection entropy in feedforward layers. Using this loss yields gains over standard fine-tuning across two different model families (PaLM-2 and Gemma-2) for downstream tasks in math, coding, and language understanding. To further elucidate the role of outgoing connection heterogeneity, we develop a data-free structured pruning method, which uses entropy to identify and remove neurons. This method is considerably more effective than removing neurons either randomly or based on their magnitude.
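The abstract does not give the exact entropy definition, so the following is only a minimal sketch under stated assumptions: each inner neuron's outgoing weights (one row of a hypothetical down-projection matrix `w_out`) are turned into a distribution by normalizing their absolute values, and Shannon entropy is computed over that distribution. Under this reading, a neuron with uniform outgoing connections has high entropy and is a pruning candidate, while a neuron with heterogeneous connections has low entropy. The function names, the normalization, and the penalty form are illustrative and are not taken from the paper.

```python
import math


def outgoing_entropy(weights):
    """Shannon entropy (in nats) of one neuron's normalized |outgoing weights|.

    Assumption: normalizing absolute values is one plausible way to form a
    distribution; the paper may define the entropy differently.
    """
    total = sum(abs(w) for w in weights)
    if total == 0.0:
        # Assumption: treat an all-zero neuron as maximally uniform.
        return math.log(len(weights))
    entropy = 0.0
    for w in weights:
        p = abs(w) / total
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy


def neurons_to_prune(w_out, num_prune):
    """Indices of the num_prune neurons with the highest outgoing entropy.

    No data is needed: the ranking depends only on the weights.
    """
    entropies = [outgoing_entropy(row) for row in w_out]
    ranked = sorted(range(len(w_out)), key=lambda j: entropies[j], reverse=True)
    return sorted(ranked[:num_prune])


def entropy_penalty(w_out, coeff):
    """Hypothetical regularizer: coeff times the mean outgoing entropy.

    Adding this to a task loss would push fine-tuning toward lower entropy.
    """
    return coeff * sum(outgoing_entropy(row) for row in w_out) / len(w_out)


# Toy layer: 3 inner neurons, each with 4 outgoing connections.
w_out = [
    [0.5, 0.5, 0.5, 0.5],   # uniform connections, highest entropy
    [2.0, 0.0, 0.0, 0.0],   # fully concentrated, zero entropy
    [0.9, 0.1, -0.4, 0.2],  # heterogeneous, intermediate entropy
]
pruned = neurons_to_prune(w_out, 1)
print(pruned)
```

On this toy layer the uniform neuron is the one selected for removal. In a real model the entropy would be computed per layer on the actual feedforward output projection, and the penalty would be differentiated by the training framework rather than evaluated in plain Python.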

Subject: EMNLP.2025 - Main