2601.14087

'1'-bit Count-based Sorting Unit to Reduce Link Power in DNN Accelerators

Authors: Ruichi Han, Yizhi Chen, Tong Lei, Jordi Altayo Gonzalez, Ahmed Hemani

Interconnect power consumption remains a bottleneck in Deep Neural Network (DNN) accelerators. Ordering data by '1'-bit count can mitigate this by reducing switching activity on the links, but practical hardware sorting implementations remain underexplored. This work proposes a hardware implementation of a comparison-free sorting unit optimized for Convolutional Neural Networks (CNNs). By leveraging approximate computing to group population counts into coarse-grained buckets, the design reduces hardware area while preserving the link power benefits of data reordering. The approximate sorting unit achieves up to 35.4% area reduction while maintaining a 19.50% bit transition (BT) reduction, compared with 20.42% for the precise implementation.
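To make the idea in the abstract concrete, the following is a minimal software sketch, not the authors' hardware design. It assumes 8-bit words, models link switching as the Hamming distance between consecutive words, and uses a counting (bucket) sort keyed on the '1'-bit count so that no element-to-element comparisons are needed. The function names and the `bucket_size` parameter are hypothetical; `bucket_size=1` stands in for a precise sort and larger values stand in for the coarse-grained, approximate variant.

```python
from typing import List


def popcount(x: int) -> int:
    """Number of '1' bits in a non-negative integer."""
    return bin(x).count("1")


def bit_transitions(words: List[int]) -> int:
    """Total bit flips on a link when words are sent back to back
    (sum of Hamming distances between consecutive words)."""
    return sum(popcount(a ^ b) for a, b in zip(words, words[1:]))


def bucket_sort_by_popcount(
    words: List[int], width: int = 8, bucket_size: int = 1
) -> List[int]:
    """Comparison-free ordering by '1'-bit count.

    Assumes every word fits in `width` bits. Each word is dropped into
    the bucket indexed by popcount // bucket_size; buckets are then
    read out in order. Words inside one bucket keep their arrival order.
    """
    num_buckets = width // bucket_size + 1
    buckets: List[List[int]] = [[] for _ in range(num_buckets)]
    for w in words:
        buckets[popcount(w) // bucket_size].append(w)
    return [w for bucket in buckets for w in bucket]


if __name__ == "__main__":
    data = [0xFF, 0x00, 0xFE, 0x01]  # illustrative values only
    precise = bucket_sort_by_popcount(data, bucket_size=1)
    approx = bucket_sort_by_popcount(data, bucket_size=4)
    print("unsorted BT:", bit_transitions(data))
    print("precise  BT:", bit_transitions(precise))
    print("approx   BT:", bit_transitions(approx))
```

Coarser buckets need fewer bins, which is the intuition behind trading a small loss in BT reduction for area. Reordering by '1'-bit count is a heuristic: it tends to lower switching on typical data but does not guarantee fewer transitions for every input.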

Subjects: Hardware Architecture, Artificial Intelligence, Machine Learning

Publish: 2026-01-20 15:47:36 UTC