Can LLMs simulate the same correct solutions to free-response math problems as real students?

2025.emnlp-main.827@ACL

Authors: Yuya Asano, Diane Litman, Erin Walker

Large language models (LLMs) have emerged as powerful tools for developing educational systems. While previous studies have explored modeling student mistakes, a critical gap remains in understanding whether LLMs can generate correct solutions that are representative of student responses to free-response problems. In this paper, we compare the distribution of solutions produced by four LLMs (one proprietary model, two open-source general-purpose models, and one open-source math model) under various sampling and prompting techniques with the distribution of solutions generated by students, using conversations in which students teach math problems to a conversational robot. Our study reveals discrepancies between the correct solutions produced by LLMs and those produced by students. We discuss the practical implications of these findings for the design and evaluation of LLM-supported educational systems.

Subject: EMNLP.2025 - Main