Efficient Heterogeneity-Aware Federated Active Data Selection


Authors: Yingpeng Tang, Chao Ren, Xiaoli Tang, Sheng-Jun Huang, Lizhen Cui, Han Yu

Federated Active Learning (FAL) aims to learn an effective global model while minimizing label queries. Owing to privacy requirements, it is challenging to design effective active data selection schemes because cross-client query information is unavailable. In this paper, we bridge this important gap by proposing the Federated Active data selection by LEverage score sampling (FALE) method. It is designed for regression tasks in the presence of non-i.i.d. client data and enables the server to select data globally in a privacy-preserving manner. Based on FedSVD, FALE estimates the utility of unlabeled data and performs data selection via leverage score sampling. In addition, a secure model learning framework is designed for federated regression tasks to exploit the queried labels. FALE can operate without an initial labeled set and selects the instances in a single pass, significantly reducing communication overhead. Theoretical analysis establishes the query complexity for FALE to achieve constant-factor approximation and relative-error approximation. Extensive experiments on 11 benchmark datasets demonstrate significant improvements of FALE over existing state-of-the-art methods.
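To make the core idea concrete, the following is a minimal sketch of leverage score sampling for linear regression. It is a centralized, non-private illustration only: it assumes the full feature matrix is available in one place and uses an ordinary SVD, whereas the abstract describes a FedSVD-based, privacy-preserving procedure whose details are not given here. The function names (`leverage_scores`, `select_queries`, `fit_weighted_least_squares`) and the sampling-with-replacement and reweighting choices are assumptions made for illustration, not the paper's algorithm.

```python
import numpy as np

def leverage_scores(X):
    """Leverage score of each row of X: squared row norms of the left singular vectors."""
    X = np.asarray(X, dtype=float)
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # Drop directions with negligible singular values (rank-deficient X).
    tol = s.max() * max(X.shape) * np.finfo(float).eps
    U = U[:, s > tol]
    return np.sum(U ** 2, axis=1)

def select_queries(X, budget, rng):
    """Sample `budget` row indices with probability proportional to leverage.

    Needs no labels, so selection happens in a single pass over the
    unlabeled pool. Sampling is with replacement (an assumption).
    """
    scores = leverage_scores(X)
    probs = scores / scores.sum()
    idx = rng.choice(len(probs), size=budget, replace=True, p=probs)
    # Standard importance weights so the subsampled problem is unbiased.
    weights = 1.0 / np.sqrt(budget * probs[idx])
    return idx, weights

def fit_weighted_least_squares(X, y, idx, weights):
    """Solve the reweighted least-squares problem on the queried rows only."""
    Xs = np.asarray(X, dtype=float)[idx] * weights[:, None]
    ys = np.asarray(y, dtype=float)[idx] * weights
    coef, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return coef
```

In this sketch only the labels at the sampled indices are ever read, which mirrors the setting where labels are queried after selection. Rows with high leverage are those that most constrain the least-squares fit, which is why they are sampled more often.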

Subject: ICML.2025 - Poster

pSdWTED0ZZ@OpenReview
