2025.emnlp-main.395@ACL

Total: 1

#1 Investigating Value-Reasoning Reliability in Small Large Language Models

Authors: Xia Du, Shuhan Sun, Pengyuan Liu, Dong Yu

Although small large language models (sLLMs) have been widely deployed in practical applications, little attention has been paid to their value-reasoning abilities, particularly their reasoning reliability. To address this gap, we propose a systematic evaluation framework for assessing the value-reasoning reliability of sLLMs. We define value-reasoning reliability as comprising: (1) output consistency under identical prompts, (2) output robustness under semantically equivalent prompts, (3) stable value reasoning in the face of attacks, and (4) consistent value reasoning in open-ended value-expression tasks. Our framework includes three core tasks: a Repetition Consistency task, an Interaction Stability task, and an Open-ended Expression Consistency task. We further incorporate self-reported confidence scores to evaluate value-reasoning reliability from two perspectives: the model's self-awareness of its values and its value-based decision-making. Our findings show that models vary significantly in their stability when responding to value-related questions. Moreover, we observe considerable output randomness that does not always correlate with self-reported confidence or expressed value preferences. This suggests that current models lack a reliable internal mechanism for stable value reasoning when addressing value-sensitive queries.
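
To make the Repetition Consistency idea concrete, here is a minimal sketch of measuring agreement across repeated identical prompts. The `ask_model` callable, the prompt, and the majority-agreement metric are illustrative assumptions, not the paper's actual implementation or metric.

```python
# Minimal sketch of a repetition-consistency check, assuming a hypothetical
# ask_model(prompt) -> str callable. Illustrative only; the paper's own
# prompts and metrics are not reproduced here.
from collections import Counter
from typing import Callable, List


def repetition_consistency(ask_model: Callable[[str], str],
                           prompt: str,
                           n_samples: int = 10) -> float:
    """Query the same value-related prompt n_samples times and return the
    fraction of responses that match the most frequent answer."""
    answers: List[str] = [ask_model(prompt).strip().lower() for _ in range(n_samples)]
    majority_count = Counter(answers).most_common(1)[0][1]
    return majority_count / n_samples


if __name__ == "__main__":
    # Stub model that drifts between two answers across identical prompts.
    responses = iter(["agree", "agree", "disagree", "agree", "agree"])
    stub = lambda _prompt: next(responses)
    score = repetition_consistency(stub, "Is it acceptable to lie to protect someone?", n_samples=5)
    print(score)  # 0.8: four of five identical-prompt runs gave the majority answer
```

A score near 1.0 would indicate stable answers to the same value-related question, while lower scores reflect the kind of output randomness the abstract reports.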

Subject: EMNLP.2025 - Main