2601.11904


Structure of Pitch-Pattern Motifs in Major League Baseball

Authors: Youngjai Park, Cheawoon Lim, Seung-Woo Son, Mi Jin Lee

Baseball consists of two teams alternating between batting and fielding while competing to score runs through sequential pitching events. Recent advances in tracking technology have enabled all Major League Baseball (MLB) clubs to record every pitch with high resolution, yet most quantitative studies have primarily emphasized single-pitch metrics, leaving the role of sequential structure less explored. Here, we examine pitch-pattern motifs of multiple lengths using approximately 12.4 million Statcast pitch recordings from the 2008-2025 MLB regular seasons at two complementary scales. At the macroscale, we quantify pitch-sequence diversity using the Shannon entropy and inverse Simpson index and examine their relationships with earned run average and win totals. At the microscale, we compare hit and out frequencies across pitch-pattern motifs. Rather than identifying outcome-determining sequences, we find that motif usage exhibits stable, non-random organization, as reflected in Zipf's and Heaps' laws, while showing limited association with conventional performance measures. While language-like scaling (Zipf's and Heaps' laws) clearly reveals an underlying 'grammar' of MLB pitch sequences, that grammar alone is insufficient to account for performance indicators such as ERA or wins. These results suggest that sequence-based analyses clarify the structural organization of pitch usage, while also delineating the limits of motif-based approaches for explaining performance without richer contextual information.
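
To make the diversity measures named in the abstract concrete, here is a minimal Python sketch under stated assumptions. It treats a motif as a contiguous run of n pitch types within one sequence, uses the natural logarithm for the Shannon entropy, and runs on made-up toy sequences with illustrative pitch-type codes. None of these choices are confirmed by the abstract, and the function names are hypothetical rather than the authors' pipeline.

```python
import math
from collections import Counter


def motif_counts(sequences, n):
    """Count contiguous length-n motifs (n-grams) across all sequences.

    Assumption: motifs never span two sequences.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy H = -sum(p * ln p) of the motif distribution (nats)."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p^2): the effective number of motifs."""
    total = sum(counts.values())
    return 1.0 / sum((c / total) ** 2 for c in counts.values())


# Hypothetical toy data: each inner list is one sequence of pitch-type codes.
toy_sequences = [
    ["FF", "SL", "FF", "CH"],
    ["FF", "SL", "CU"],
    ["SL", "FF", "SL"],
]

counts = motif_counts(toy_sequences, 2)
entropy = shannon_entropy(counts)
simpson = inverse_simpson(counts)

# Frequencies sorted by rank, the raw input to a Zipf-style rank-frequency plot.
rank_frequency = sorted(counts.values(), reverse=True)

print(f"distinct motifs: {len(counts)}")
print(f"Shannon entropy (nats): {entropy:.3f}")
print(f"inverse Simpson index: {simpson:.3f}")
```

Both measures are largest when every motif is equally frequent: for k equally likely motifs the entropy is ln k and the inverse Simpson index is k. Lower values indicate that usage is concentrated on a few motifs.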

Subject: Physics and Society

Publish: 2026-01-17 04:33:35 UTC