DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers

#1 DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers [PDF⁶] [Copy] [Kimi¹⁰] [REL]

Authors: Lianwei Yang, Haisong Gong, Haokun Lin, Yichen Wu, Zhenan Sun, Qingyi Gu

Vision Transformers (ViTs) have gained significant attention, but their high computing cost limits the practical applications. While post-training quantization (PTQ) reduces model size and speeds up inference, it often degrades performance, especially in low-bit settings. We identify two key reasons for the performance degradation: 1) existing quantization methods fail to align with the power-law distribution of post-Softmax activations, and 2) reparameterizing post-LayerNorm activations leads to a performance drop due to the significant influence of outliers in the scaling factors. To address these challenges, we propose DopQ-ViT, a Distribution-friendly and Outlier-aware Post-training Quantization method for ViTs. First, DopQ-ViT introduces the Tan Quantizer (TanQ), which better preserves the power-law distribution of post-Softmax activations by focusing more on values near 1. Second, DopQ-ViT presents the MAD-guided Optimal Scaling Factor (MOSF), which selects the optimal scaling factor without introducing additional calculations. Extensive experiments across various ViT models and quantization settings demonstrate that DopQ-ViT, with the help of TanQ and MOSF, outperforms previous PTQ methods on both classification and detection tasks.

Subject: Computer Vision and Pattern Recognition

Publish: 2024-08-06 16:40:04 UTC

2408.03291

#1 DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers [PDF6] [Copy] [Kimi10] [REL]

#1 DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers [PDF⁶] [Copy] [Kimi¹⁰] [REL]