Low-latency Federated LLM Fine-tuning Over Wireless Networks

Authors: Zhiwen Pang, Kang Wei, Long Shi, Zhe Wang, Jun Li, Feng Shu

Recently, federated large language models (LLMs) have drawn significant attention because they combine the capabilities of LLMs with federated learning (FL), which addresses privacy concerns in collaborative fine-tuning. However, because LLMs have so many parameters, existing federated LLM fine-tuning frameworks face significant challenges on resource-constrained clients with heterogeneous computing capabilities and random wireless channels. To address this issue, we propose a joint client-specific pruning and bandwidth allocation (JCPBA) framework for federated LLMs that improves fine-tuning efficiency over wireless networks. Specifically, we formulate a fine-tuning latency minimization problem that jointly optimizes pruning rates and bandwidth allocations, and we solve it using a block coordinate descent method. Extensive experiments on the Yahoo Answers and GSM8K datasets demonstrate that the proposed framework significantly reduces wall-clock fine-tuning time compared with state-of-the-art baselines and achieves equal or lower test loss with lower computation and communication overhead.
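The abstract does not give the latency model or the subproblem solvers, so the following is only a toy sketch of how block coordinate descent can alternate between bandwidth allocation and pruning rates. Everything in it is an assumption made for illustration, not the paper's formulation: the per-client latency `(1 - p) * (comp + size / (b * rate))`, the min-max (synchronous round) objective, the total pruning `budget` standing in for an accuracy constraint, the cap `p_max`, and all function names.

```python
import numpy as np

def bisect_min(feasible, lo, hi, iters=60):
    """Approximate the smallest T in [lo, hi] with feasible(T) True.
    Assumes feasible(hi) is True and feasibility is monotone in T."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi

def latency(p, b, comp, size, rate):
    # Assumed toy model: pruning a fraction p scales both the compute
    # time and the upload payload by (1 - p).
    return (1.0 - p) * (comp + size / (b * rate))

def update_bandwidth(p, comp, size, rate):
    # Block 1: pruning rates fixed. Find the smallest deadline T for which
    # the bandwidth shares needed to meet T sum to at most 1.
    keep = 1.0 - p

    def shares(T):
        slack = T - keep * comp          # time left for the upload
        if np.any(slack <= 0.0):
            return None
        return keep * size / (rate * slack)

    def feasible(T):
        s = shares(T)
        return s is not None and s.sum() <= 1.0

    k = len(comp)
    lo = np.max(keep * comp)
    hi = np.max(latency(p, np.full(k, 1.0 / k), comp, size, rate))
    s = shares(bisect_min(feasible, lo, hi))
    return s / s.sum()                   # hand out any leftover bandwidth

def update_pruning(b, comp, size, rate, budget, p_max):
    # Block 2: bandwidth fixed. Find the smallest deadline T reachable
    # while the pruning rates stay within p_max and sum to at most budget.
    full = comp + size / (b * rate)      # unpruned per-client latency

    def rates(T):
        return np.clip(1.0 - T / full, 0.0, p_max)

    lo = np.max((1.0 - p_max) * full)
    hi = np.max(full)
    T = bisect_min(lambda t: rates(t).sum() <= budget, lo, hi)
    return rates(T)

def bcd(comp, size, rate, budget=1.0, p_max=0.8, iters=10):
    k = len(comp)
    p = np.zeros(k)                      # start unpruned
    b = np.full(k, 1.0 / k)              # start with equal bandwidth shares
    history = [latency(p, b, comp, size, rate).max()]
    for _ in range(iters):
        b = update_bandwidth(p, comp, size, rate)
        p = update_pruning(b, comp, size, rate, budget, p_max)
        history.append(latency(p, b, comp, size, rate).max())
    return p, b, history

# Hypothetical four-client setup (made-up numbers, arbitrary units).
comp = np.array([2.0, 4.0, 1.0, 3.0])    # full-model compute time
size = np.array([1.0, 1.0, 1.0, 1.0])    # full-model upload size
rate = np.array([1.0, 0.5, 2.0, 1.0])    # throughput at full bandwidth
p, b, history = bcd(comp, size, rate)
print("pruning rates:", p)
print("bandwidth shares:", b)
print("round latency per iteration:", history)
```

Each block update solves its subproblem with the other block held fixed, so the worst-client latency recorded in `history` does not increase from one iteration to the next. In this toy model the slow client with the weak channel ends up with both a larger bandwidth share and a higher pruning rate, which is the kind of client-specific trade-off the abstract describes.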

Subject: Distributed, Parallel, and Cluster Computing

Publish: 2026-02-01 05:15:50 UTC

2602.01024
