Towards Robust Human–AI Decision-Making via Learning-to-Defer

AI systems often fail on challenging or out-of-distribution inputs—a critical limitation in domains such as healthcare, finance, and autonomous driving. Learning to Defer (L2D) addresses this by training models not only to predict but also to decide when to defer to external experts. This thesis develops a unified and robust framework for L2D that advances its theoretical foundations, reliability, and applicability. It characterizes Bayes-optimal routing policies, establishes surrogate-consistency guarantees, and introduces a unified adversarial framework for attacking and defending L2D with Bayes-optimal robustness. It further proposes the first top-k deferral methods in both two-stage and one-stage settings. Empirical studies validate these ideas in multi-task learning and extractive question answering with large language models. Ongoing work explores token-level routing in LLMs, online adaptation with dynamic experts, and partial deferral.
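To make the deferral idea concrete, here is a minimal sketch of a routing rule in the spirit of the Bayes-optimal policies the abstract mentions. It assumes 0-1 loss, where the rule commonly stated in the L2D literature is to defer whenever an expert's estimated probability of being correct (minus any consultation cost) exceeds the model's own confidence. The function names `route` and `top_k_route`, the `cost` parameter, and the convention that `-1` denotes the model are all hypothetical choices for illustration, not the thesis's formulation.

```python
import numpy as np

MODEL = -1  # hypothetical convention: -1 means "keep the model's prediction"

def route(model_probs, expert_acc, cost=0.0):
    """Pick a single agent for one input.

    model_probs: the model's class probabilities for this input.
    expert_acc:  estimated probability that each expert is correct on this input.
    cost:        assumed consultation cost subtracted from every expert's score.
    Returns MODEL, or the index of the expert to defer to.
    """
    model_conf = float(np.max(np.asarray(model_probs, dtype=float)))
    expert_value = np.asarray(expert_acc, dtype=float) - cost
    best = int(np.argmax(expert_value))
    # Defer only when the best expert is strictly more likely to be right.
    return best if expert_value[best] > model_conf else MODEL

def top_k_route(model_probs, expert_acc, k, cost=0.0):
    """Illustrative top-k variant: return the k highest-scoring agents."""
    model_conf = float(np.max(np.asarray(model_probs, dtype=float)))
    expert_value = np.asarray(expert_acc, dtype=float) - cost
    scores = np.concatenate(([model_conf], expert_value))
    order = np.argsort(-scores, kind="stable")[:k]
    # Position 0 in `scores` is the model; shift so experts are 0-indexed.
    return [int(i) - 1 for i in order]
```

For example, with model probabilities `[0.6, 0.3, 0.1]` and expert accuracies `[0.9, 0.5]`, `route` defers to expert 0. In practice the expert accuracies are not known and must themselves be estimated, which is where surrogate losses and their consistency guarantees come in.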

Subject: AAAI.2026 - Doctoral Consortium Track

42160@AAAI
