89efa87dc8f0a5d18e4ae0a479658f60@2023@MLSYS

#1 ApproxCaliper: A Programmable Framework for Application-aware Neural Network Optimization

Authors: Yifan Zhao ; Hashim Sharif ; Peter Pao-Huang ; Vatsin Shah ; Arun Narenthiran Sivakumar ; Mateus Valverde Gasparino ; Abdulrahman Mahmoud ; Nathan Zhao ; Sarita Adve ; Girish Chowdhary ; Sasa Misailovic ; Vikram Adve

To deploy compute-intensive neural networks on resource-constrained edge systems, developers use model optimization techniques that reduce model size and computational cost. Existing optimization tools are application-agnostic (they optimize model parameters solely with respect to neural network accuracy) and can thus miss optimization opportunities. We propose ApproxCaliper, the first programmable framework for application-aware neural network optimization. By incorporating application-specific goals, ApproxCaliper facilitates more aggressive optimization of the neural networks than application-agnostic techniques allow. We perform experiments on five different neural networks used in two real-world robotics systems: a commercial agriculture robot and a simulation of an autonomous electric cart. Compared to Learning Rate Rewinding (LRR), a state-of-the-art structured pruning tool used in an application-agnostic setting, ApproxCaliper achieves 5.3x higher speedup and 2.9x lower GPU resource utilization, and 36x and 6.1x additional model size reduction, respectively.
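
To illustrate the contrast the abstract draws, the following is a minimal sketch of application-agnostic versus application-aware model selection. It is not ApproxCaliper's API: the `Candidate` type, the `pick_fastest` helper, the thresholds, and every number are hypothetical and were invented for this example. It only shows why checking an application-level goal, rather than network accuracy alone, can admit a more aggressively optimized model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical optimized model variant (all values are made up)."""
    name: str
    speedup: float        # relative to the unoptimized model
    accuracy_drop: float  # network-level accuracy loss, in percentage points
    app_error: float      # hypothetical application-level error metric


def pick_fastest(candidates, is_acceptable):
    """Return the fastest candidate that passes the check, or None."""
    acceptable = [c for c in candidates if is_acceptable(c)]
    return max(acceptable, key=lambda c: c.speedup, default=None)


# Illustrative data only; not taken from the paper.
CANDIDATES = [
    Candidate("baseline", 1.0, 0.0, 0.10),
    Candidate("pruned-light", 2.0, 0.5, 0.11),
    Candidate("pruned-heavy", 5.0, 4.0, 0.14),
    Candidate("pruned-extreme", 9.0, 12.0, 0.40),
]

# Application-agnostic: constrain only the network's accuracy loss.
agnostic = pick_fastest(CANDIDATES, lambda c: c.accuracy_drop <= 1.0)

# Application-aware: constrain the end-to-end application metric instead.
aware = pick_fastest(CANDIDATES, lambda c: c.app_error <= 0.15)

print(agnostic.name, agnostic.speedup)
print(aware.name, aware.speedup)
```

In this toy data the accuracy-only check rejects `pruned-heavy` even though its application-level error stays within the assumed tolerance, so the application-aware check selects a faster model.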