We introduce GraviBERT, a novel deep learning framework for inference from gravitational-wave time series, which combines an Inception-inspired multi-scale convolutional feature extractor with a transformer encoder and a regression head. GraviBERT is trained in two stages: a BERT-style pretraining phase, in which the model learns to predict masked segments in feature space to capture universal patterns and physics, followed by supervised fine-tuning for accurate parameter estimation. This approach improves on training from scratch across multiple metrics. On in-domain data, it reduces the mean absolute error for point-estimate parameter inference by up to $30\%$ and accelerates training convergence by up to a factor of six. Moreover, at low signal-to-noise ratio, the mean relative precision of the inferred masses and distances reaches the few-percent level, while the mean absolute error in the effective spin is about $10^{-3}$. For domain adaptation to new detector noise profiles, the pretrained model converges up to $15\times$ faster on small target datasets and reduces estimation errors by up to approximately $45\%$, indicating that it learns sufficiently detector-agnostic representations. Cross-approximant transfer performs comparably, achieving up to $44\%$ reductions in mean absolute error across all parameters and up to $15\times$ training speedups, with $R^2$ scores consistently exceeding 0.9 for mass parameters at signal-to-noise ratio 10, compared to 0.74–0.87 when training from scratch. Notably, GraviBERT works directly with noisy waveforms. After pretraining, the final regression head of the model can be adapted for a range of downstream tasks, positioning GraviBERT as a step towards foundation-style models in gravitational-wave and multi-messenger astronomy.
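The BERT-style pretraining objective described above, predicting masked segments in feature space, can be illustrated with a minimal NumPy sketch. This is not the GraviBERT implementation: the function names `sample_segment_mask` and `masked_feature_loss`, the segment length, the mask ratio, the zero vector used as a mask token, and the mean-squared-error loss are all assumptions chosen for illustration, and random features stand in for the output of the convolutional extractor.

```python
import numpy as np


def sample_segment_mask(num_tokens, segment_len, mask_ratio, rng):
    """Boolean mask of shape (num_tokens,) made of contiguous masked segments.

    The sequence is split into blocks of `segment_len` tokens and roughly a
    fraction `mask_ratio` of the blocks is masked (hypothetical scheme).
    """
    num_blocks = num_tokens // segment_len
    num_masked_blocks = max(1, int(round(mask_ratio * num_blocks)))
    chosen = rng.choice(num_blocks, size=num_masked_blocks, replace=False)
    mask = np.zeros(num_tokens, dtype=bool)
    for block in chosen:
        mask[block * segment_len:(block + 1) * segment_len] = True
    return mask


def masked_feature_loss(predicted, target, mask):
    """Mean squared error evaluated on the masked token positions only."""
    diff = predicted[mask] - target[mask]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)

# Stand-in for the feature sequence: 64 tokens with 16 features each.
features = rng.normal(size=(64, 16))
mask = sample_segment_mask(64, segment_len=4, mask_ratio=0.25, rng=rng)

# Hide the masked segments; zeros stand in for a learned mask embedding.
corrupted = features.copy()
corrupted[mask] = 0.0

# A trained encoder would predict the hidden features from the visible ones.
# Here the corrupted input itself serves as a trivial "prediction" baseline.
loss = masked_feature_loss(corrupted, features, mask)
print(f"masked tokens: {mask.sum()}, baseline loss: {loss:.3f}")
```

In a full pipeline, the corrupted sequence would pass through the transformer encoder, and the loss would be minimized with respect to the encoder and feature-extractor weights before the regression head is attached for fine-tuning.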