Post-Hoc Merging is Not Enough: Many-Shot Model Merging with Loss-Gap Balancing

Authors: Kyungjin Im, Miru Kim, Chanin Eom, Minhae Kwon

Model merging has become a practical post-training strategy for building a single multi-task large language model (LLM) by combining multiple task-specialized models. However, most existing approaches rely on post-hoc merging, in which task-specific models are merged only once after training. This one-shot aggregation often suffers from task interference, leading to information erasure across individual tasks. In this work, we show that replacing post-hoc merging with an iterative many-shot merging protocol is effective in improving multi-task performance. Building on this insight, we propose METIS, Mitigating Erasure from Task Interference for Stable many-shot merging. METIS is a loss-aware many-shot merging method that addresses information erasure in post-hoc merging through task-wise loss-gap weighting and consensus-based masking. Notably, METIS exhibits significant performance improvement on the worst-performing task, effectively mitigating information erasure. (Project page: https://imkyungjin.github.io/METIS/)
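The abstract names two ingredients, task-wise loss-gap weighting and consensus-based masking, without giving their formulas. The sketch below is one illustrative reading of a single merge round, not the authors' METIS implementation. Everything specific in it is an assumption: the loss gap is taken as merged-model loss minus task-expert loss, the weights are a softmax over those gaps, and "consensus" is read as keeping only task updates whose sign agrees with the weighted majority sign for each parameter. The names `loss_gap_weights`, `consensus_mask`, `merge_step`, and `temperature` are hypothetical.

```python
import math


def loss_gap_weights(merged_losses, expert_losses, temperature=1.0):
    """Softmax over per-task loss gaps (assumed: merged loss minus expert loss).

    Tasks on which the merged model lags its expert the most get the
    largest weight.
    """
    gaps = [m - e for m, e in zip(merged_losses, expert_losses)]
    top = max(gaps)  # subtract the max for numerical stability
    exps = [math.exp((g - top) / temperature) for g in gaps]
    total = sum(exps)
    return [x / total for x in exps]


def consensus_mask(task_vectors, weights):
    """For each parameter, mark the tasks whose update sign matches the
    weighted majority sign (assumed reading of "consensus-based masking").

    Returns mask[j][k] for parameter j and task k.
    """
    dim = len(task_vectors[0])
    mask = []
    for j in range(dim):
        vote = sum(w * tv[j] for w, tv in zip(weights, task_vectors))
        mask.append([tv[j] * vote > 0 for tv in task_vectors])
    return mask


def merge_step(base, task_vectors, merged_losses, expert_losses):
    """One merge round: weight tasks by loss gap, mask conflicting updates,
    and add the weighted average of the surviving updates to the base."""
    weights = loss_gap_weights(merged_losses, expert_losses)
    mask = consensus_mask(task_vectors, weights)
    merged = []
    for j, b in enumerate(base):
        kept = [
            (w, tv[j])
            for k, (w, tv) in enumerate(zip(weights, task_vectors))
            if mask[j][k]
        ]
        norm = sum(w for w, _ in kept)
        update = sum(w * v for w, v in kept) / norm if norm > 0 else 0.0
        merged.append(b + update)
    return merged, weights


# Toy example: two tasks, two parameters. Task vectors are the
# differences between each task-specialized model and the base.
base = [0.0, 0.0]
task_vectors = [[1.0, -1.0], [1.0, 2.0]]
merged, weights = merge_step(
    base,
    task_vectors,
    merged_losses=[1.0, 2.0],   # hypothetical losses of the merged model
    expert_losses=[0.5, 0.5],   # hypothetical losses of the task experts
)
```

In this toy run the second task has the larger gap, so it receives the larger weight, and on the second parameter the first task's conflicting negative update is masked out. Under a many-shot protocol, a step of this kind would be repeated across training rounds rather than applied once after training.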

Subject: Artificial Intelligence

Publish: 2026-06-15 10:03:01 UTC

2606.16501
