42141@AAAI

#1 Theory-of-Mind in Partially Observed, Mixed-Motive Games

Author: Nitay Alon

Theory of Mind (ToM) enables agents to model others' mental states, but in mixed-motive games this capacity can lead to deceptive behaviour and alignment risks. My research investigates how ToM affects strategic behaviour in partially observed games, and makes three contributions: (1) a formal model of ToM-driven manipulation in a preference elicitation task, (2) evidence that excessive ToM leads to paranoid-like overmentalisation, and (3) the Aleph-IPOMDP model, a framework for multi-agent systems that balances ToM reasoning with game-theoretic principles to prevent manipulation by deterring capable agents from deceiving. My work contributes to the understanding of deceptive AI, to overcoming deception in multi-agent systems, and to computational models of human cognition.

Subject: AAAI.2026 - Doctoral Consortium Track