2604.14815


Domain Fine-Tuning FinBERT on Finnish Histopathological Reports: Train-Time Signals and Downstream Correlations

Authors: Rami Luisto, Liisa Petäinen, Tommi Grönholm, Jan Böhm, Maarit Ahtiainen, Tomi Lilja, Ilkka Pölönen, Sami Äyrämö

In NLP classification tasks where little labeled data exists, domain fine-tuning of transformer models on unlabeled data is an established approach. In this paper we have two aims. (1) We describe our observations from fine-tuning the Finnish BERT model on Finnish medical text data. (2) We report on our attempts to predict the benefit of domain-specific pre-training of Finnish BERT from observing the geometry of embedding changes due to domain fine-tuning. Our driving motivation is the common situation in healthcare AI where we might experience long delays in acquiring datasets, especially with respect to labels.

Subject: Computation and Language

Publish: 2026-04-16 09:36:48 UTC