2511.19739

Comparative Analysis of LoRA-Adapted Embedding Models for Clinical Cardiology Text Representation

Authors: Richard J. Young, Alice M. Matthews

Domain-specific text embeddings are critical for clinical natural language processing, yet systematic comparisons across model architectures remain limited. This study evaluates ten transformer-based embedding models adapted for cardiology through Low-Rank Adaptation (LoRA) fine-tuning on 106,535 cardiology text pairs derived from authoritative medical textbooks. Results demonstrate that encoder-only architectures, particularly BioLinkBERT, achieve superior domain-specific performance (separation score: 0.510) compared to larger decoder-based models, while requiring significantly fewer computational resources. The findings challenge the assumption that larger language models necessarily produce better domain-specific embeddings and provide practical guidance for clinical NLP system development. All models, training code, and evaluation datasets are publicly available to support reproducible research in medical informatics.
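For readers unfamiliar with LoRA, the following is a minimal sketch of the general technique, not the authors' training code. LoRA freezes a pretrained weight matrix W of shape (d, k) and learns only a low-rank update, so the effective weight is W + (alpha / r) * B A, with B of shape (d, r) and A of shape (r, k). The dimensions, the scaling factor `alpha`, and the helper names `matmul`, `lora_effective_weight`, and `trainable_params` are illustrative assumptions; the abstract does not state the rank or other hyperparameters used in the study.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank (rows of A)."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # (d, r) @ (r, k) -> (d, k)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d, k, r):
    """Parameters trained by LoRA for one (d, k) matrix: B is (d, r), A is (r, k)."""
    return r * (d + k)


# Hypothetical toy sizes, chosen only for illustration.
d, k, r, alpha = 8, 6, 2, 4
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]     # frozen pretrained weight
a = [[rng.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]  # small random init
b = [[0.0] * r for _ in range(d)]                               # zero init: update starts as a no-op

w_eff = lora_effective_weight(w, a, b, alpha)
```

With B initialised to zero, `w_eff` equals `w` before any training step. For a hypothetical 768 x 768 projection at rank 8, `trainable_params(768, 768, 8)` gives 12,288 trainable values against 589,824 in the full matrix, which is why adapting many models this way is comparatively cheap.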

Subjects: Computation and Language, Machine Learning

Published: 2025-11-24 21:57:09 UTC