A Causal World Model Underlying Next Token Prediction: Exploring GPT in a Controlled Environment

Authors: Raanan Yehezkel Rohekar, Yaniv Gurwicz, Sungduk Yu, Estelle Aflalo Guez, Vasudev Lal

Are generative pre-trained transformer (GPT) models, trained only to predict the next token, implicitly learning a world model from which sequences are generated one token at a time? We address this question by deriving a causal interpretation of the attention mechanism in GPT and presenting a causal world model that arises from this interpretation. Furthermore, we propose that GPT models, at inference time, can be utilized for zero-shot causal structure learning for input sequences, and introduce a corresponding confidence score. Empirical tests were conducted in controlled environments using the setups of the Othello and Chess strategy games. A GPT, pre-trained on real-world games played with the intention of winning, was tested on out-of-distribution synthetic data consisting of sequences of random legal moves. We find that the GPT model is likely to generate legal next moves for out-of-distribution sequences for which a causal structure is encoded in the attention mechanism with high confidence. In cases where it generates illegal moves, it also fails to capture a causal structure.

Subject: ICML.2025 - Poster

qA3xHJzF6B@OpenReview
