

#1 Mr. DETR: Instructive Multi-Route Training for Detection Transformers

Authors: Chang-Bin Zhang, Yujie Zhong, Kai Han

Existing methods enhance the training of detection transformers by incorporating an auxiliary one-to-many assignment. In this work, we treat the model as a multi-task framework that simultaneously performs one-to-one and one-to-many predictions. We investigate the role of each component in the transformer decoder, including self-attention, cross-attention, and the feed-forward network, across these two training targets. Our empirical results demonstrate that any independent component in the decoder can effectively learn both targets simultaneously, even when the other components are shared. This finding leads us to propose a multi-route training mechanism, featuring a primary route for one-to-one prediction and two auxiliary training routes for one-to-many prediction. We enhance the training mechanism with a novel instructive self-attention that dynamically and flexibly guides object queries for one-to-many prediction. The auxiliary routes are removed during inference, leaving the model architecture and inference cost unchanged. We conduct extensive experiments on various baselines and achieve consistent improvements.
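The one-to-one versus one-to-many assignment that the abstract contrasts can be illustrated on a toy cost matrix. This is a minimal stdlib-only sketch, not the paper's implementation: the greedy matcher stands in for the Hungarian matching typically used for one-to-one assignment in DETR-style detectors, and all function names, the cost values, and `k` are illustrative assumptions.

```python
# Toy contrast of one-to-one vs one-to-many label assignment.
# Rows of the cost matrix are object queries, columns are ground-truth boxes.

def one_to_one_assign(cost):
    """Greedy stand-in for Hungarian matching: each ground truth
    is supervised by exactly one query, and each query is used at most once."""
    assigned, used = {}, set()
    for gt in range(len(cost[0])):
        best_q = min(
            (q for q in range(len(cost)) if q not in used),
            key=lambda q: cost[q][gt],
        )
        assigned[gt] = [best_q]
        used.add(best_q)
    return assigned

def one_to_many_assign(cost, k=2):
    """Each ground truth supervises its k lowest-cost queries,
    giving auxiliary routes a denser training signal."""
    return {
        gt: sorted(range(len(cost)), key=lambda q: cost[q][gt])[:k]
        for gt in range(len(cost[0]))
    }

# 4 queries x 2 ground truths (lower cost = better match).
cost = [
    [0.1, 0.9],
    [0.2, 0.8],
    [0.7, 0.3],
    [0.9, 0.2],
]
print(one_to_one_assign(cost))   # {0: [0], 1: [3]}
print(one_to_many_assign(cost))  # {0: [0, 1], 1: [3, 2]}
```

The one-to-many variant is used only as an auxiliary training signal; at inference the model keeps the one-to-one route, which avoids duplicate predictions without non-maximum suppression.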

Subject: CVPR.2025 - Poster