Self-ensemble: Mitigating Confidence Distortion for Large Language Models

Authors: Zicheng Xu, Guanchu Wang, Guangyao Zheng, Yu-Neng Chuang, Alexander Szalay, Xia Hu, Vladimir Braverman

Although Large Language Models (LLMs) perform well in general fields, they exhibit a confidence distortion problem on multiple-choice question answering (MCQA), particularly as the number of answer choices increases. Specifically, on MCQA with many choices, LLMs are under-confident in correct predictions and over-confident in incorrect ones, which substantially degrades performance. To solve this problem, we propose Self-ensemble. Our method splits the choices into several groups and ensembles the LLM's predictions across these groups to reach a final decision. The advantage of Self-ensemble is its plug-and-play nature: it can be integrated into existing LLM architectures through a designed attention mask and positional encoding, without requiring labeled datasets for parameter tuning. Experimental results on three LLMs and datasets demonstrate that Self-ensemble comprehensively addresses the confidence distortion problem of LLMs, outperforming standard inference as well as baseline methods.
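The group-and-ensemble idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes integration through an attention mask and positional encoding, whereas this sketch simply calls a scorer once per group. The `Scorer` interface, the random partitioning, the `group_size` and `rounds` parameters, and the averaging rule are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: given a question and a subset of choices, return one
# probability per choice (summing to 1). In practice this would wrap an LLM call.
Scorer = Callable[[str, Sequence[str]], List[float]]


def split_into_groups(choices: Sequence[str], group_size: int,
                      rng: random.Random) -> List[List[str]]:
    """Shuffle the choices and cut them into consecutive groups."""
    shuffled = list(choices)
    rng.shuffle(shuffled)
    return [shuffled[i:i + group_size]
            for i in range(0, len(shuffled), group_size)]


def self_ensemble_predict(question: str, choices: Sequence[str], scorer: Scorer,
                          group_size: int = 4, rounds: int = 3,
                          seed: int = 0) -> str:
    """Score each choice inside small groups and average over several partitions.

    Assumes the choices are unique strings. Groups of unequal size would bias
    the within-group probabilities, so this sketch works best when
    len(choices) is a multiple of group_size.
    """
    rng = random.Random(seed)
    totals: Dict[str, float] = {c: 0.0 for c in choices}
    for _ in range(rounds):
        for group in split_into_groups(choices, group_size, rng):
            probs = scorer(question, group)
            for choice, p in zip(group, probs):
                totals[choice] += p
    # Each choice appears in exactly one group per round, so dividing by
    # `rounds` gives its average within-group probability.
    return max(choices, key=lambda c: totals[c] / rounds)


def toy_scorer(question: str, group: Sequence[str]) -> List[float]:
    """Stand-in for an LLM: gives the choice 'Paris' twice the weight of others."""
    weights = [2.0 if c == "Paris" else 1.0 for c in group]
    total = sum(weights)
    return [w / total for w in weights]
```

With the toy scorer, calling `self_ensemble_predict("Capital of France?", cities, toy_scorer)` on a list of eight city names scores each city only against the three others in its group, so no single call has to rank all eight choices at once.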

Subjects: Computation and Language, Machine Learning

Published: 2025-06-02 17:59:29 UTC

arXiv: 2506.01951
