Sample-wise Adaptive Weighting for Transfer Consistency in Adversarial Distillation

Adversarial distillation in the standard min-max adversarial training framework aims to transfer adversarial robustness from a large, robust teacher network to a compact student. However, existing work often neglects to incorporate state-of-the-art robust teachers. Through extensive analysis, we find that stronger teachers do not necessarily yield more robust students, a phenomenon known as robust saturation. While this is typically attributed to capacity gaps, we show that such explanations are incomplete. Instead, we identify adversarial transferability (the fraction of student-crafted adversarial examples that remain effective against the teacher) as a key factor in successful robustness transfer. Based on this insight, we propose Sample-wise Adaptive Adversarial Distillation (SAAD), which reweights training examples by their measured transferability without incurring additional computational cost. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that SAAD consistently improves AutoAttack robustness over prior methods. Our code is available at https://github.com/HongsinLee/saad.
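
To make the transferability measure concrete, the following is a minimal sketch and not the authors' implementation. It assumes that every supplied example was crafted against the student, and that an example counts as "effective against the teacher" when the teacher's prediction differs from the true label. The function names, the `low`/`high` weights, and the choice to upweight transferable examples are all hypothetical, chosen only for illustration; the abstract does not specify the actual weighting rule.

```python
from typing import List, Sequence


def transferability(teacher_preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of student-crafted adversarial examples that also fool the teacher.

    Assumption: teacher_preds[i] is the teacher's predicted class on the i-th
    adversarial example crafted against the student, and labels[i] is its true class.
    """
    if len(teacher_preds) != len(labels):
        raise ValueError("teacher_preds and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("need at least one example")
    fooled = sum(1 for p, y in zip(teacher_preds, labels) if p != y)
    return fooled / len(labels)


def sample_weights(
    teacher_preds: Sequence[int],
    labels: Sequence[int],
    low: float = 0.5,
    high: float = 1.5,
) -> List[float]:
    """Hypothetical per-sample weights, normalized to have mean 1.

    Examples that transfer to the teacher get the raw weight `high`, the rest
    get `low`. This rule is an illustrative assumption, not the SAAD rule.
    """
    if len(teacher_preds) != len(labels):
        raise ValueError("teacher_preds and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("need at least one example")
    raw = [high if p != y else low for p, y in zip(teacher_preds, labels)]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


if __name__ == "__main__":
    labels = [1, 1, 2, 0]
    teacher_preds = [1, 0, 2, 2]  # the teacher is fooled on examples 1 and 3
    print(transferability(teacher_preds, labels))
    print(sample_weights(teacher_preds, labels))
```

In a training loop, weights of this kind would multiply the per-example distillation loss before averaging. Because the teacher's outputs on the student's adversarial examples are presumably already computed for the distillation target, measuring transferability this way would need no extra forward passes, in line with the abstract's statement that the reweighting adds no computational cost.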

Subject: Computer Vision and Pattern Recognition

Publish: 2025-12-11 04:31:04 UTC

2512.10275