Humans excel at understanding and reasoning about novel, compositionally structured knowledge, largely due to their capacity for compositional generalization, a cognitive skill that has recently been validated in structured neural networks. However, most existing research has focused on semantic translation within canonical language environments, often neglecting the explicit connection to compositional generalization behavior. In contrast, humans typically demonstrate this ability through interaction with their environments rather than solely through internal reasoning. To address this gap, we propose CraftFactory, a benchmark for evaluating compositional generalization in an interactive control environment, which makes the evaluation more realistic and comprehensive. CraftFactory has three key features: (1) it offers an open-ended interactive control environment with thousands of items and flexible actions; (2) it requires advanced compositional inference through various combinations and complex permutations of instructions; and (3) it evaluates compositional generalization intuitively through interactive behavior. With CraftFactory, we aim to promote the development of more advanced compositional generalization methods, thereby contributing to the broader field of general AI.