RAG-Coding: Enhancing LLM Medical Coding with Structured External Knowledge


Authors: Yidong Gan, David D. Nguyen, Yang Lin, Peter Zhong, Thanh Vu, Long Duong, Yuan-Fang Li

We present RAG-Coding, an agentic method for automated ICD-10-CM coding. RAG-Coding orchestrates four large language model (LLM) agents and grounds their coding decisions in external knowledge sources (e.g., the official coding tabular list and guidelines). By retrieving and cross-referencing relevant knowledge from these sources, the agents enhance coding accuracy and ensure clinical compliance. On the MDACE dataset, RAG-Coding outperforms the best LLM-based baseline by 8-13% in micro-F1 and 2-8% in macro-F1 across multiple LLM backbones. Compared with PLM-ICD, the state-of-the-art pretrained language model method, RAG-Coding achieves higher micro recall (+11%), while PLM-ICD achieves higher micro precision (+6%), yielding comparable micro- and macro-F1. Ablations show stepwise gains, highlighting the importance of incorporating external knowledge. We also release MDACE-2025, which updates the original dataset with expert re-annotations that follow the latest 2025 ICD-10-CM guidelines. This update features more fine-grained code labels and enables evaluation against current clinical standards.
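The micro- and macro-averaged scores reported above weight errors differently: micro averaging pools decisions over all codes, so frequent codes dominate, while macro averaging gives every code equal weight. The following is a minimal sketch of how such scores can be computed for multi-label code assignment. It is not the authors' evaluation script; the function name `micro_macro_scores`, the toy codes, and the choice to define macro-F1 as the mean of per-code F1 are assumptions made for illustration (some medical coding papers instead derive macro-F1 from macro precision and macro recall).

```python
from collections import Counter


def _prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
    return prec, rec, f1


def micro_macro_scores(gold, pred):
    """gold, pred: lists of sets of codes, one set per document (hypothetical format)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for code in g & p:
            tp[code] += 1  # predicted and correct
        for code in p - g:
            fp[code] += 1  # predicted but not in the gold labels
        for code in g - p:
            fn[code] += 1  # in the gold labels but missed

    # Micro: pool counts over all codes, then compute the metrics once.
    micro_p, micro_r, micro_f1 = _prf(
        sum(tp.values()), sum(fp.values()), sum(fn.values())
    )

    # Macro (assumed definition): mean of per-code F1 over every code seen.
    codes = set(tp) | set(fp) | set(fn)
    per_code_f1 = [_prf(tp[c], fp[c], fn[c])[2] for c in codes]
    macro_f1 = sum(per_code_f1) / len(per_code_f1) if per_code_f1 else 0.0

    return {
        "micro_precision": micro_p,
        "micro_recall": micro_r,
        "micro_f1": micro_f1,
        "macro_f1": macro_f1,
    }


# Toy example with illustrative labels only (not taken from MDACE).
gold = [{"E11.9", "I10"}, {"I10"}]
pred = [{"E11.9"}, {"I10", "J45.909"}]
scores = micro_macro_scores(gold, pred)
```

In this toy example the single spurious code lowers macro-F1 more than micro-F1, because the rare code counts as much as the frequent ones under macro averaging.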

Subjects: Computation and Language, Artificial Intelligence, Information Retrieval

Published: 2026-04-09 06:27:03 UTC

2605.27377
