When training data are limited, data-driven models are especially vulnerable to optimization-related fluctuations from random initialization and to sampling-induced bias from insufficient training data. We address both challenges with transfer learning (TL): deep neural networks (DNNs) are first pretrained on $\alpha$ decay half-lives and then fine-tuned on a small cluster decay dataset. The pretraining stage provides a physically informed initialization that stabilizes optimization, while transferred global decay systematics regularize the fit and reduce sensitivity to training set composition. Despite extreme data sparsity, the resulting models accurately predict cluster decay half-lives for parent nuclei from $^{221}$Fr to $^{242}$Cm. We further quantify how initialization and sample selection affect predictive accuracy and robustness, demonstrating that TL enables stable and reliable learning in the small-sample regime.