Recent advances in Spoken Language Understanding (SLU) have been driven by pre-trained speech processing models. However, deploying these models on resource-constrained devices remains challenging due to their large number of parameters. This paper presents PruneSLU, a new method for compressing pre-trained SLU models while maintaining performance. Our approach combines vocabulary pruning and structural layer-wise pruning to reduce model size while preserving essential knowledge. After pruning, the model undergoes knowledge refinement through integration distillation and contrastive learning. Experiments on the STOP and SLURP datasets demonstrate that PruneSLU compresses a 39M-parameter model to 15M parameters while retaining 98\% of the original performance, outperforming previous compression techniques.
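To make the two pruning steps concrete, the minimal sketch below illustrates vocabulary pruning (keeping only the embedding rows for tokens retained after pruning) and structural layer-wise pruning (dropping whole transformer layers). It assumes a HuggingFace-style PyTorch model exposing `get_input_embeddings`/`set_input_embeddings` and a `ModuleList` of layers; `keep_token_ids`, `keep_layer_ids`, and how they are selected are hypothetical placeholders, not the paper's exact criteria.

```python
# Illustrative sketch only -- not the PruneSLU implementation.
import torch
import torch.nn as nn


def prune_vocabulary(model, keep_token_ids):
    """Shrink the embedding table to a retained subset of the vocabulary.

    `keep_token_ids` is an assumed input (e.g., tokens observed in the
    target SLU corpus); returns a map from old token ids to new ids.
    """
    old_emb = model.get_input_embeddings()                  # nn.Embedding(V, d)
    keep = torch.as_tensor(sorted(keep_token_ids), dtype=torch.long)
    new_emb = nn.Embedding(len(keep), old_emb.embedding_dim)
    new_emb.weight.data.copy_(old_emb.weight.data[keep])    # copy kept rows
    model.set_input_embeddings(new_emb)
    return {int(old_id): new_id for new_id, old_id in enumerate(keep)}


def prune_layers(layers, keep_layer_ids):
    """Structurally remove whole transformer layers, keeping the indexed subset."""
    return nn.ModuleList(layers[i] for i in sorted(keep_layer_ids))
```

After such pruning, the reduced model would be the starting point for the knowledge-refinement stage (distillation and contrastive learning) described above.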