ECCV 2024

#1 Model Stock: All we need is just a few fine-tuned models

Authors: Dong-Hwan Jang; Sangdoo Yun; Dongyoon Han

This paper introduces a novel fine-tuning method for large pre-trained models, offering strong performance with greater efficiency. Breaking away from the traditional practice of averaging a multitude of fine-tuned models for accuracy improvements, our approach uses significantly fewer models to optimize the final weights, yet achieves superior accuracy. Based on key observations of the dynamics in fine-tuned models' weight space, our novel layer-wise averaging technique can surpass state-of-the-art model averaging methods such as Model Soup with just two fine-tuned models. We aptly coin this strategy Model Stock, reflecting its reliance on selecting very few models to obtain a better-averaged model. We demonstrate the efficacy of Model Stock with fine-tuned models based on pre-trained CLIP architectures, achieving remarkable performance on both in-distribution (ID) and out-of-distribution (OOD) tasks on standard benchmarks, all while adding negligible computational overhead. Our code and pre-trained models will be made publicly available.
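
The abstract describes Model Stock only at a high level: merge a small number of fine-tuned checkpoints layer by layer, using the pre-trained weights as an anchor. Below is a minimal sketch of such a two-model, layer-wise merge; the per-layer interpolation ratio used here is an assumption for illustration (the abstract does not spell out the formula), and the function name is hypothetical.

```python
# A minimal sketch of layer-wise weight averaging anchored at the pre-trained
# weights, in the spirit of the abstract. The ratio t = 2*cos(theta) / (1 + cos(theta)),
# where theta is the angle between the two fine-tuned deltas relative to the
# pre-trained anchor, is an ASSUMED formula; verify it against the paper.
import torch


def model_stock_merge(pretrained: dict, finetuned_a: dict, finetuned_b: dict) -> dict:
    """Merge two fine-tuned state dicts layer by layer around a pre-trained anchor.

    Assumes all three state dicts share the same keys and hold floating-point
    tensors of matching shapes.
    """
    merged = {}
    for name, w0 in pretrained.items():
        w1, w2 = finetuned_a[name], finetuned_b[name]
        # Directions of each fine-tuned model relative to the pre-trained anchor.
        d1, d2 = (w1 - w0).flatten(), (w2 - w0).flatten()
        cos = torch.nn.functional.cosine_similarity(d1, d2, dim=0).clamp(-0.999, 0.999)
        # Per-layer interpolation ratio between the two-model average and the
        # anchor (assumed formula; see lead-in above).
        t = 2.0 * cos / (1.0 + cos)
        w12 = (w1 + w2) / 2.0  # plain average of the two fine-tuned models
        merged[name] = t * w12 + (1.0 - t) * w0
    return merged
```

Given two CLIP checkpoints fine-tuned from the same pre-trained model, `model_stock_merge(pretrained.state_dict(), ft_a.state_dict(), ft_b.state_dict())` would produce a single merged state dict at roughly the cost of one pass over the weights, consistent with the abstract's claim of negligible extra computation.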