DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics

2025.findings-emnlp.1265@ACL

Authors: Luke Yoffe, Alfonso Amayuelas, William Yang Wang

Multi-agent debates have been introduced to improve the accuracy of Large Language Models (LLMs) by having multiple agents discuss solutions to a problem over several rounds of debate. However, models often generate incorrect yet confident-sounding responses, which can mislead the other agents. This issue arises partly because agents do not consider how confident their peers are. To address this, we propose DebUnc, a debate framework that uses uncertainty metrics to assess agent confidence. Confidence is then conveyed through textual prompts or via a modified attention mechanism that adjusts token weights. Evaluations across benchmarks show that attention-based methods are particularly effective and that performance continues to improve as uncertainty estimation becomes more reliable. The code is available at https://github.com/lukeyoffe/debunc.
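
The abstract names two steps: estimating each agent's uncertainty, then using it to reweight attention over that agent's tokens. The sketch below illustrates one way this could look. It is not the authors' implementation. The choice of mean token entropy as the uncertainty metric, the inverse-uncertainty weighting, and all function names (`mean_token_entropy`, `confidence_from_uncertainty`, `rescale_attention`) are assumptions made for illustration; the linked repository holds the actual method.

```python
import math


def mean_token_entropy(token_distributions):
    """Average Shannon entropy (in nats) over per-token probability distributions.

    Assumed uncertainty metric: higher entropy means a less confident response.
    """
    entropies = [
        -sum(p * math.log(p) for p in probs if p > 0)
        for probs in token_distributions
    ]
    return sum(entropies) / len(entropies)


def confidence_from_uncertainty(uncertainties, eps=1e-8):
    """Map per-agent uncertainties to weights that average 1.

    Hypothetical scheme: weight is proportional to inverse uncertainty.
    """
    inverse = [1.0 / (u + eps) for u in uncertainties]
    mean_inverse = sum(inverse) / len(inverse)
    return [v / mean_inverse for v in inverse]


def rescale_attention(weights, token_owner, agent_confidence):
    """Scale one row of attention weights by the owning agent's confidence.

    token_owner[i] is the index of the agent that produced token i, or None
    for tokens that belong to no peer response (for example, the question).
    The row is renormalized so it still sums to 1.
    """
    scaled = [
        w * (agent_confidence[owner] if owner is not None else 1.0)
        for w, owner in zip(weights, token_owner)
    ]
    total = sum(scaled)
    return [s / total for s in scaled]


# Two peer agents: agent 0 is less uncertain than agent 1.
confidence = confidence_from_uncertainty([0.5, 1.5])
# One prompt token followed by one token from agent 0 and two from agent 1.
row = rescale_attention([0.25, 0.25, 0.25, 0.25], [None, 0, 1, 1], confidence)
```

In this toy setup, attention shifts toward the token from the more confident agent while the row remains a valid distribution. The text-prompt variant mentioned in the abstract would instead state each peer's confidence in the prompt alongside that peer's response.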

Subject: EMNLP.2025 - Findings