LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress?

Authors: Teddy Foley, Spencer Guo, Henry Josephson, Anqi Qu, Jack Sanderson

This paper examines whether large language model (LLM) capabilities can continue to advance without additional compute by analyzing the development and role of algorithms used in state-of-the-art LLMs. Motivated by regulatory efforts that have largely focused on restricting access to high-performance hardware, we ask: Can LLMs progress in a compute-constrained environment, and how do algorithmic innovations perform under such conditions? To address these questions, we introduce a novel classification framework that distinguishes between compute-dependent innovations, which yield disproportionate benefits at high compute levels (e.g., the Transformer architecture and mixture-of-experts models), and compute-independent innovations, which improve efficiency across all compute scales (e.g., rotary positional encoding, FlashAttention, or layer normalization). We quantify these contributions using a metric called compute-equivalent gain (CEG), which estimates the additional compute that would be required to achieve similar improvements without these algorithmic advancements. To validate this framework, we conduct small-scale training experiments with a scaled-down GPT-2 model. Our results confirm that compute-independent advancements yield meaningful performance gains even in resource-constrained settings, with a CEG of up to $3.5\times$ over a baseline model. By contrast, compute-dependent advancements provide little benefit or even degrade performance at the small scale, reinforcing the importance of compute availability for certain algorithmic gains.
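
As a rough illustration of how a compute-equivalent gain could be estimated, the sketch below assumes the baseline model's loss follows a simple power law in training compute, loss = a * compute^(-b). This functional form, the function names, and all numbers are assumptions made for illustration; the abstract does not specify how the authors compute CEG. Under that assumption, CEG is the compute the baseline would need to match the improved model's loss, divided by the compute the improved model actually used.

```python
import math


def baseline_compute_for_loss(loss: float, a: float, b: float) -> float:
    """Invert the ASSUMED baseline power law: loss = a * compute ** (-b)."""
    return (loss / a) ** (-1.0 / b)


def compute_equivalent_gain(
    loss_with_innovation: float, compute_used: float, a: float, b: float
) -> float:
    """Ratio of baseline compute needed to reach the same loss to compute used."""
    return baseline_compute_for_loss(loss_with_innovation, a, b) / compute_used


# Hypothetical values, not taken from the paper.
a, b = 10.0, 0.1                    # assumed baseline scaling-law coefficients
compute_used = 1e15                 # assumed training compute in FLOPs
baseline_loss = a * compute_used ** (-b)
improved_loss = 0.95 * baseline_loss  # assumed 5% loss reduction from an innovation

ceg = compute_equivalent_gain(improved_loss, compute_used, a, b)
print(f"Estimated CEG: {ceg:.2f}x")
```

With a shallow scaling exponent such as the assumed b = 0.1, even a small loss reduction corresponds to a large multiple of baseline compute, which is why a CEG-style metric is sensitive to how the baseline curve is fitted.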

Subjects: Machine Learning, Artificial Intelligence

Publish: 2025-05-07 02:26:17 UTC

2505.04075
