SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity

Authors: Samir Khaki, Xiuyu Li, Junxian Guo, Ligeng Zhu, Konstantinos N (Kostas) Plataniotis, Amir Yazdanbakhsh, Kurt Keutzer, Song Han, Zhijian Liu

Fine-tuning LLMs is both computationally and memory-intensive. While parameter-efficient fine-tuning methods, such as QLoRA and DoRA, reduce the number of trainable parameters and lower memory usage, they do not decrease computational cost. In some cases, they may even slow down fine-tuning. In this paper, we introduce SparseLoRA, a method that accelerates LLM fine-tuning through contextual sparsity. We propose a lightweight, training-free SVD sparsity estimator that dynamically selects a sparse subset of weights for loss and gradient computation. We also systematically analyze and address sensitivity across layers, tokens, and training steps. Our experimental results show that SparseLoRA reduces computational cost by up to $2.0\times$ and achieves a measured speedup of up to $1.5\times$ while maintaining accuracy across various downstream tasks, including commonsense and arithmetic reasoning, code generation, and instruction following.
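The abstract does not spell out how the SVD sparsity estimator works, so the following is only an illustrative sketch of the general idea of contextual sparsity with a low-rank proxy, not the authors' implementation. It assumes a single linear layer, uses a truncated SVD of the frozen weight to cheaply estimate which output channels matter for the current input, and then computes only those channels exactly. The function names, `rank`, and `keep_ratio` are all hypothetical, and only the forward pass is shown.

```python
import numpy as np

def build_lowrank_proxy(W, rank):
    """Hypothetical helper: truncated SVD factors of a frozen weight W (d_out x d_in).

    Computed once, with no training, so that A @ B approximates W.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (d_out, rank), singular values folded in
    B = Vt[:rank, :]             # (rank, d_in)
    return A, B

def select_channels(x, A, B, keep_ratio):
    """Score output channels with the cheap proxy and keep the top fraction.

    x: (tokens, d_in). The score is the per-channel L2 norm over tokens,
    which is an assumption made for this sketch.
    """
    approx = (x @ B.T) @ A.T                    # (tokens, d_out), low-rank estimate
    scores = np.linalg.norm(approx, axis=0)     # one score per output channel
    k = max(1, int(round(keep_ratio * A.shape[0])))
    return np.sort(np.argsort(scores)[-k:])     # indices of the k largest scores

def sparse_linear(x, W, idx):
    """Compute only the selected output channels exactly; the rest stay zero."""
    y = np.zeros((x.shape[0], W.shape[0]), dtype=x.dtype)
    y[:, idx] = x @ W[idx].T
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((16, 8))
    x = rng.standard_normal((4, 8))
    A, B = build_lowrank_proxy(W, rank=2)
    idx = select_channels(x, A, B, keep_ratio=0.5)
    y = sparse_linear(x, W, idx)
    print("kept channels:", idx.tolist())
    print("output shape:", y.shape)
```

In this sketch the proxy costs roughly `rank * (d_in + d_out)` multiply-adds per token, against `d_in * d_out` for the dense layer, which is why a small `rank` keeps the selection step cheap relative to the computation it skips.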

Subject: ICML.2025 - Poster

z83rodY0Pw@OpenReview
