OpAGOfAhT0@OpenReview

Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment

Authors: Hua Ye, Hang Ding, Siyuan Chen, Yiyang Jiang, Zhang Changyuan, Xuan Zhang

Most multimodal models treat every negative pair alike, ignoring the ambiguous negatives that differ from the positive by only a small detail. We propose Boundary-Aware Curriculum with Local Attention (BACL), a lightweight add-on that turns these borderline cases into a curriculum signal. A Boundary-aware Negative Sampler gradually raises difficulty, while a Contrastive Local Attention loss highlights where the mismatch occurs. The two modules are fully differentiable and work with any off-the-shelf dual encoder. Theory predicts a fast $\tilde{\mathcal{O}}(1/n)$ error rate; in practice, BACL achieves up to +32% R@1 over CLIP and new state-of-the-art results on four large-scale benchmarks, all without extra labels.
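To make the curriculum idea concrete, the sketch below shows one simple way that negatives could be re-weighted so that borderline cases matter more as training proceeds. It is not the BACL sampler or loss, whose details the abstract does not give. It is a generic hardness-weighted InfoNCE loss, and the function names, the `progress` schedule, and the `temperature` value are all assumptions made for illustration.

```python
import math


def curriculum_negative_weights(neg_sims, progress, temperature=0.1):
    """Hypothetical curriculum weights over negatives (not the paper's sampler).

    neg_sims: similarity of each negative to the anchor (higher = harder).
    progress: training progress in [0, 1]; 0 gives uniform weights,
              1 gives weights fully sharpened toward the hardest negatives.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    # Scaling the logits by progress makes them all zero at the start,
    # so early training treats every negative alike.
    logits = [progress * s / temperature for s in neg_sims]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_info_nce(pos_sim, neg_sims, weights, temperature=0.1):
    """InfoNCE-style loss for one anchor with per-negative weights.

    With uniform weights (1/n each) this reduces to the standard InfoNCE loss.
    """
    n = len(neg_sims)
    pos = math.exp(pos_sim / temperature)
    neg = sum(n * w * math.exp(s / temperature)
              for w, s in zip(weights, neg_sims))
    return -math.log(pos / (pos + neg))


# Toy example: three negatives of increasing difficulty.
sims = [0.1, 0.5, 0.9]
early = curriculum_negative_weights(sims, progress=0.0)
late = curriculum_negative_weights(sims, progress=1.0)
loss_early = weighted_info_nce(0.95, sims, early)
loss_late = weighted_info_nce(0.95, sims, late)
```

In this toy setup the early weights are uniform, while the late weights concentrate on the negative closest to the positive, so the same batch yields a larger loss late in training. A real system would compute the similarities from the dual encoder's embeddings and could sample negatives according to these weights instead of re-weighting them.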

Subject: NeurIPS.2025 - Poster