2025.emnlp-main.517@ACL

PRISM: Efficient Long-Range Reasoning With Short-Context LLMs

Authors: Dulhan Jayalath, James Bradley Wendt, Nicholas Monath, Sandeep Tata, Beliz Gunel

Long-range tasks demand reasoning over long inputs, yet existing solutions are limited: long-context models require large compute budgets, parameter-efficient fine-tuning (PEFT) needs training data, and retrieval-augmented generation (RAG) entails complex task-specific designs. In-context approaches overcome many of these issues, but methods built on short-context LLMs are inefficient, trading context length for processing more tokens. We introduce **PRISM**, a highly token-efficient in-context method based on structured schemas that outperforms baselines on diverse tasks with **4x shorter contexts**. The approach produces concise outputs and efficiently leverages key-value (KV) caches to **reduce costs by up to 54%**. PRISM scales down to tiny contexts without increasing costs or sacrificing quality, and generalizes to new tasks with minimal effort by generating schemas from task descriptions.
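The mechanism the abstract describes is an incremental, schema-structured read: a short-context model processes the long input chunk by chunk, folding each chunk into a compact typed state instead of carrying raw text forward. The sketch below illustrates that loop under stated assumptions; it is not the authors' implementation, and `call_llm`, `prism_style_answer`, the example schema, and the prompt layout are all hypothetical.

```python
# Minimal sketch of a schema-structured incremental read (assumptions, not the
# paper's code). A hypothetical `call_llm(prompt) -> str` stands in for any
# short-context LLM API.
import json


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a short-context LLM call."""
    raise NotImplementedError


# A task-specific structured schema: the model distills each chunk into this
# compact state rather than accumulating the raw text.
SCHEMA_HINT = json.dumps(
    {
        "entities": "key entities mentioned so far",
        "evidence": "short notes relevant to the question",
    },
    indent=2,
)


def prism_style_answer(question: str, document: str, chunk_size: int = 2000) -> str:
    state = {"entities": [], "evidence": []}
    chunks = [document[i : i + chunk_size] for i in range(0, len(document), chunk_size)]
    for chunk in chunks:
        # Stable instructions come first so a server-side prefix (KV) cache can
        # be reused across calls; only the state and the chunk vary.
        prompt = (
            f"Question: {question}\n"
            f"Maintain a state following this JSON schema:\n{SCHEMA_HINT}\n\n"
            f"Current state: {json.dumps(state)}\n"
            f"Next chunk:\n{chunk}\n\n"
            "Return the updated state as JSON only."
        )
        state = json.loads(call_llm(prompt))
    # Answer from the accumulated structured state, not the full document.
    return call_llm(
        f"Question: {question}\nState: {json.dumps(state)}\nAnswer concisely."
    )
```

Note the prompt ordering: keeping the fixed question and schema as a stable prefix lets repeated calls reuse cached key-value states for that prefix, which is the kind of saving behind the cost reduction the abstract reports.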

Subject: EMNLP.2025 - Main