2603.21389

Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models

Authors: Jinghan Cao, Yu Ma, Xinjin Li, Qingyang Ren, Xiangyun Chen

Large Language Models achieve remarkable performance, but their substantial computational costs make them unsuitable for resource-constrained deployments. This paper presents the first comprehensive task-specific efficiency analysis comparing 16 language models across five diverse NLP tasks. We introduce the Performance-Efficiency Ratio (PER), a novel metric that integrates accuracy, throughput, memory, and latency through geometric mean normalization. Our systematic evaluation reveals that small models (0.5B to 3B parameters) achieve superior PER scores on all five tasks. These findings establish quantitative foundations for deploying small models in production environments that prioritize inference efficiency over marginal accuracy gains.
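The abstract does not give the PER formula, so the following is only a minimal sketch of one way a geometric mean over normalized metrics could be computed. The normalization against a per-metric best value, the function names `normalize` and `per_score`, and all numbers are assumptions made for illustration; they are not taken from the paper.

```python
import math


def normalize(value, best, higher_is_better=True):
    """Scale a raw metric into (0, 1] relative to the best value observed.

    Assumption: each metric is normalized against the best model in the
    comparison set. The paper may use a different scheme.
    """
    if value <= 0 or best <= 0:
        raise ValueError("metrics must be positive")
    return value / best if higher_is_better else best / value


def per_score(accuracy, throughput, memory, latency, best):
    """Hypothetical PER: geometric mean of four normalized metrics.

    Accuracy and throughput are treated as higher-is-better;
    memory and latency are treated as lower-is-better.
    """
    parts = [
        normalize(accuracy, best["accuracy"]),
        normalize(throughput, best["throughput"]),
        normalize(memory, best["memory"], higher_is_better=False),
        normalize(latency, best["latency"], higher_is_better=False),
    ]
    return math.prod(parts) ** (1 / len(parts))


# Made-up reference values: the best result per metric across all models.
best = {"accuracy": 0.80, "throughput": 100.0, "memory": 2.0, "latency": 10.0}

# A made-up model that matches the best accuracy and memory footprint
# but has half the best throughput and twice the best latency.
example = per_score(accuracy=0.80, throughput=50.0, memory=2.0, latency=20.0, best=best)
```

A geometric mean penalizes a model that is weak on any single axis more than an arithmetic mean would, which is one plausible reason to prefer it when combining accuracy with efficiency measurements.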

Subjects: Computation and Language, Machine Learning

Published: 2026-03-22 20:19:45 UTC