Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems

Authors: Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang

Large Language Models (LLMs) have demonstrated exceptional capabilities, yet selecting the most reliable response from multiple LLMs remains a challenge, particularly in resource-constrained settings. Existing approaches often depend on costly external verifiers, human evaluators, or self-consistency techniques that require multiple samples from a single model. While multi-LLM systems produce more diverse responses than single models and thus have greater potential, they often underperform single-LLM self-consistency. In this work, we propose a calibrated log-likelihood-based selection framework to improve multi-LLM performance. Our approach leverages uncertainty estimation to identify the most confident response while minimizing inference costs. We show that our method outperforms majority voting and exceeds self-consistency performance when using a large number of model calls. Through extensive experiments, we demonstrate improvements of approximately 4%, 3%, and 5% on GSM8K, MMLU, and ARC, respectively, when applying uncertainty-aware selection to multi-LLM systems.
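To make the idea of log-likelihood-based selection concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give the scoring or calibration formula, so everything here is an assumption for illustration: the `Candidate` type, the use of mean token log-probability as the confidence score, and the per-model additive `offsets` standing in for calibration are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Candidate:
    """One response from one model (hypothetical container)."""
    model: str
    answer: str
    token_logprobs: List[float]  # log-probability of each generated token


def confidence(c: Candidate, offsets: Dict[str, float]) -> float:
    # Assumption: length-normalized log-likelihood, shifted by a per-model
    # baseline so that scores from different models are comparable.
    return mean(c.token_logprobs) - offsets.get(c.model, 0.0)


def select_answer(cands: List[Candidate],
                  offsets: Optional[Dict[str, float]] = None) -> str:
    """Return the answer of the candidate with the highest confidence."""
    offsets = offsets or {}
    return max(cands, key=lambda c: confidence(c, offsets)).answer


def majority_vote(cands: List[Candidate]) -> str:
    """Baseline: return the most frequent answer."""
    return Counter(c.answer for c in cands).most_common(1)[0][0]


candidates = [
    Candidate("model_a", "42", [-0.1, -0.2]),
    Candidate("model_b", "41", [-1.0, -0.5]),
    Candidate("model_c", "41", [-0.9, -0.7]),
]
picked = select_answer(candidates)
voted = majority_vote(candidates)
```

In this toy input the two selection rules disagree: the vote follows the two low-confidence responses, while the likelihood-based rule follows the single high-confidence one. Each model is called once, which is the inference-cost setting the abstract targets.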

Subject: EMNLP.2025 - Findings

2025.findings-emnlp.1367@ACL