SH4FFEQ3Yv@OpenReview

Total: 1

#1 KIND: Knowledge Integration and Diversion for Training Decomposable Models

Authors: Yucheng Xie, Fu Feng, Ruixiao Shi, Jing Wang, Yong Rui, Xin Geng

Pre-trained models have become the preferred backbone as model parameters continue to grow in scale and complexity. However, traditional pre-trained models often face deployment challenges due to their fixed sizes, and are prone to negative transfer when discrepancies arise between training tasks and target tasks. To address this, we propose **KIND**, a novel pre-training method designed to construct decomposable models. KIND integrates knowledge by incorporating Singular Value Decomposition (SVD) as a structural constraint, with each basic component represented as a combination of a column vector, singular value, and row vector from the $U$, $\Sigma$, and $V^\top$ matrices. These components are categorized into **learngenes**, which encapsulate class-agnostic knowledge, and **tailors**, which capture class-specific knowledge, with knowledge diversion facilitated by a class gate mechanism during training. Extensive experiments demonstrate that models pre-trained with KIND can be decomposed into learngenes and tailors, which can be adaptively recombined for diverse resource-constrained deployments. Moreover, for tasks with large domain shifts, transferring only the learngenes, which carry task-agnostic knowledge, together with randomly initialized tailors effectively mitigates the domain shift. Code will be made available at https://github.com/Te4P0t/KIND.
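To make the described decomposition concrete, below is a minimal sketch (not the authors' released implementation) of a linear layer whose weight is a sum of rank-1 SVD-style components $u_i \sigma_i v_i^\top$: the first few components act as shared learngenes, the remainder as tailors masked by a soft class gate. All names here (`DecomposableLinear`, `n_learngene`, `gate`) are illustrative assumptions, not identifiers from the paper or repository.

```python
# Hypothetical sketch of an SVD-constrained decomposable layer.
# Weight W = U diag(S) V^T; learngene components are always active,
# tailor components are modulated by a per-class gate during training.
import torch
import torch.nn as nn


class DecomposableLinear(nn.Module):
    def __init__(self, in_dim, out_dim, rank, n_learngene, n_classes):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_dim, rank) * 0.02)  # column vectors of U
        self.S = nn.Parameter(torch.ones(rank))                   # singular values
        self.V = nn.Parameter(torch.randn(in_dim, rank) * 0.02)   # rows of V^T (stored as V)
        self.n_learngene = n_learngene
        # Soft class gate over the tailor components (assumed form, for illustration).
        self.gate = nn.Parameter(torch.ones(n_classes, rank - n_learngene))

    def weight(self, class_ids=None):
        keep = torch.ones_like(self.S)
        if class_ids is not None:
            # Average the gates of the classes present in the batch for the tailor part.
            tailor_gate = torch.sigmoid(self.gate[class_ids]).mean(dim=0)
            keep = torch.cat([keep[: self.n_learngene], tailor_gate])
        # W = U diag(S * keep) V^T; learngene components (first n_learngene) stay active.
        return (self.U * (self.S * keep)) @ self.V.t()

    def forward(self, x, class_ids=None):
        return x @ self.weight(class_ids).t()


# Usage: for transfer, one would keep only the learngene components and
# re-initialize the tailor components for the new task.
layer = DecomposableLinear(in_dim=128, out_dim=64, rank=16, n_learngene=8, n_classes=10)
y = layer(torch.randn(4, 128), class_ids=torch.tensor([0, 1, 2, 3]))
```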

Subject: ICML.2025 - Poster