asQJx56NqB@OpenReview

Total: 1

#1 Position: We Can’t Understand AI Using our Existing Vocabulary [PDF3] [Copy] [Kimi3] [REL]

Authors: John Hewitt, Robert Geirhos, Been Kim

This position paper argues that, in order to understand AI, we cannot rely on our existing vocabulary of human words. Instead, we shouldstrive to develop neologisms: new words thatrepresent precise human concepts that we wantto teach machines, or machine concepts that weneed to learn. We start from the premise thathumans and machines have differing concepts.This means interpretability can be framed as acommunication problem: humans must be able toreference and control machine concepts, and communicate human concepts to machines. Creatinga shared human-machine language through developing neologisms, we believe, could solve thiscommunication problem. Successful neologismsachieve a useful amount of abstraction: not toodetailed, so they’re reusable in many contexts, andnot too high-level, so they convey precise information. As a proof of concept, we demonstrate howa “length neologism” enables controlling LLMresponse length, while a “diversity neologism” allows sampling more variable responses. Takentogether, we argue that we cannot understand AIusing our existing vocabulary, and expanding itthrough neologisms creates opportunities for bothcontrolling and understanding machines better.

Subject: ICML.2025 - Spotlight