Entropy Meets Importance: A Unified Head Importance-Entropy Score for Stable and Efficient Transformer Pruning

Authors: Minsik Choi, Hyegang Son, Changhoon Kim, Young Geun Kim

Transformer-based models have achieved remarkable performance in NLP tasks. However, their structural characteristics (multiple layers and attention heads) introduce efficiency challenges in inference and deployment. To address these challenges, various pruning methods have recently been proposed. Notably, gradient-based methods using Head Importance Scores (HIS) have gained traction for their interpretability, efficiency, and ability to identify redundant heads. However, HIS alone is limited: it captures only the gradient-driven contribution of each head and overlooks the diversity of attention patterns. To overcome this limitation, we introduce a novel pruning criterion, HIES (Head Importance-Entropy Score), which integrates head importance scores with attention entropy, providing complementary evidence on per-head contribution. Empirically, HIES-based pruning yields up to a 15.2% improvement in model quality and a 2.04x improvement in stability over HIS-only methods, enabling substantial model compression without sacrificing either accuracy or stability. Code will be released upon publication.
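
The abstract does not state how HIES combines its two signals, so the following is only a minimal sketch of the general idea, not the authors' formula. It assumes per-head importance values are already available (for example from a gradient-based HIS computation), measures attention entropy as the mean Shannon entropy of a head's attention rows, and mixes the two after min-max normalization with a hypothetical weight `alpha`. The function names, the convex combination, and the choice to treat higher entropy as evidence for keeping a head are all illustrative assumptions.

```python
import math


def attention_entropy(attn_rows):
    """Mean Shannon entropy (in nats) over one head's attention rows.

    Each row is assumed to be a probability distribution over key positions.
    """
    total = 0.0
    for row in attn_rows:
        total += -sum(p * math.log(p) for p in row if p > 0.0)
    return total / len(attn_rows)


def min_max(values):
    """Rescale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_scores(importance, entropy, alpha=0.5):
    """Hypothetical convex mix of normalized importance and entropy.

    alpha is an assumed hyperparameter, not a value from the paper.
    """
    imp_n, ent_n = min_max(importance), min_max(entropy)
    return [alpha * i + (1.0 - alpha) * e for i, e in zip(imp_n, ent_n)]


def heads_to_prune(scores, k):
    """Indices of the k lowest-scoring heads (candidates for removal)."""
    return sorted(range(len(scores)), key=lambda h: scores[h])[:k]


# Toy example: three heads, each with two attention rows over two keys.
importance = [0.9, 0.1, 0.5]  # assumed precomputed importance values
attention = [
    [[0.5, 0.5], [0.5, 0.5]],  # uniform attention, maximal entropy
    [[1.0, 0.0], [1.0, 0.0]],  # one-hot attention, zero entropy
    [[0.9, 0.1], [0.8, 0.2]],  # in between
]
entropy = [attention_entropy(rows) for rows in attention]
scores = combined_scores(importance, entropy, alpha=0.5)
print(heads_to_prune(scores, k=1))
```

In this toy setup the second head has both the lowest importance and the lowest entropy, so it is the first pruning candidate under any `alpha` between 0 and 1. With real models the two signals can disagree, which is where a combined criterion would differ from importance alone.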

Subjects: Computation and Language, Artificial Intelligence, Machine Learning

Published: 2025-10-10 12:08:20 UTC

2510.13832
