IRQ0n961nn@OpenReview

Total: 1

#1 Learngene Tells You How to Customize: Task-Aware Parameter Initialization at Flexible Scales

Authors: Jiaze Xu, Shiyu Xia, Xu Yang, JIAQI LYU, Xin Geng

Appropriate parameter initialization strategies are essential for reducing the high computational costs of training large pretrained models in various task scenarios. Graph HyperNetwork (GHN), a parameter initialization method, has recently demonstrated strong performance in initializing models. However, GHN still faces several challenges, including limited effectiveness in initializing larger models, poor performance on smaller datasets, and the requirement of task-specific GHN training, where each new task necessitates retraining the GHN model, leading to increased computational and storage overhead. To overcome these challenges, motivated by the recently proposed Learngene framework, we propose a novel method called **T**ask-**A**ware **L**earngene (**TAL**). Briefly, our approach pretrains a TAL model under the guidance of a well-trained model and then performs multi-task tuning to obtain a shared TAL model that enables parameter prediction based on both model architectures and task-specific characteristics. Extensive experiments demonstrate the superiority of TAL. Models initialized with TAL outperform those initialized using the GHN method by an average of 24.39% in accuracy across the Decathlon datasets.
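The abstract does not give implementation details, so the following is only a minimal illustrative sketch of the general idea of predicting a target model's initial parameters from both an architecture descriptor and a task embedding. It is not the authors' actual TAL (or GHN) architecture; the class name, dimensions, and inputs are all hypothetical.

```python
import torch
import torch.nn as nn


class TaskAwareParamPredictor(nn.Module):
    """Hypothetical sketch: predict a flat parameter vector for one target
    layer from an architecture descriptor and a task embedding."""

    def __init__(self, arch_dim: int, task_dim: int, hidden_dim: int, num_target_params: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(arch_dim + task_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_target_params),
        )

    def forward(self, arch_desc: torch.Tensor, task_emb: torch.Tensor) -> torch.Tensor:
        # Condition the predicted initialization on both the target
        # architecture and the task-specific characteristics.
        return self.net(torch.cat([arch_desc, task_emb], dim=-1))


# Usage sketch: predict initial weights for a 64x64 linear layer of a
# downstream model (all inputs here are placeholders, not real encodings).
predictor = TaskAwareParamPredictor(arch_dim=16, task_dim=8, hidden_dim=128, num_target_params=64 * 64)
arch_desc = torch.randn(1, 16)   # e.g. an encoding of layer shape / position
task_emb = torch.randn(1, 8)     # e.g. an embedding of dataset statistics
init_weights = predictor(arch_desc, task_emb).view(64, 64)
```

In this sketch the same predictor could serve multiple tasks and target scales by varying `task_emb` and `arch_desc`, which mirrors the paper's stated goal of a single shared model usable across tasks and architectures without per-task retraining.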

Subject: ICML.2025 - Poster