Total: 1
Label completion serves as a preprocessing approach to handling the sparse crowdsourced label matrix problem, significantly boosting the effectiveness of the downstream label aggregation. In recent advances, worker modeling has been proved to be a powerful strategy to further improve the performance of label completion. However, in real-world scenarios, workers typically annotate only a few instances, leading to insufficient worker modeling and thus limiting the improvement of label completion. To address this issue, we propose a novel transfer learning-based label completion (TLLC) method. Specifically, we first identify all high-confidence instances from the whole crowdsourced data as a source domain and use it to pretrain a Siamese network. The abundant annotated instances in the source domain provide essential knowledge for worker modeling. Then, we transfer the pretrained network to the target domain with the instances annotated by each worker separately, ensuring worker modeling captures unique characteristics of each worker. Finally, we leverage the new embeddings learned by the transferred network to complete each worker’s missing labels. Extensive experiments on several widely used real-world datasets demonstrate the effectiveness of TLLC. Our codes and datasets are available at https://github.com/jiangliangxiao/TLLC.