Building Patient Journeys in Hebrew: A Language Model for Clinical Timeline Extraction

Authors: Kai Golan Hashiloni, Brenda Kasabe Nokai, Michal Shevach, Esthy Shemesh, Ronit Bartin, Anna Bergrin, Liran Harel, Nachum Dershowitz, Liat Nadai Arad, Kfir Bar

We present a new Hebrew medical language model designed to extract structured clinical timelines from electronic health records, enabling the construction of patient journeys. Our model is based on DictaBERT 2.0 and continually pre-trained on over five million de-identified hospital records. To evaluate its effectiveness, we introduce two new datasets -- one from internal medicine and emergency departments, and another from oncology -- annotated for event temporal relations. Our results show that our model achieves strong performance on both datasets. We also find that vocabulary adaptation improves token efficiency and that de-identification does not compromise downstream performance, supporting privacy-conscious model development. The model is made available for research use under ethical restrictions.

Subject: Computation and Language

Published: 2025-12-12 11:54:50 UTC

arXiv: 2512.11502