2601.21294

Missing-Data-Induced Phase Transitions in Spectral PLS for Multimodal Learning

Authors: Anders Gjølbye, Ida Kargaard, Emma Kargaard, Lars Kai Hansen

Partial Least Squares (PLS) learns shared structure from paired data via the top singular vectors of the empirical cross-covariance (PLS-SVD), but multimodal datasets often have missing entries in both views. We study PLS-SVD under independent entry-wise missing-completely-at-random masking in a proportional high-dimensional spiked model. After appropriate normalization, the masked cross-covariance behaves like a spiked rectangular random matrix whose effective signal strength is attenuated by $\sqrt{\rho}$, where $\rho$ is the joint entry retention probability. As a result, PLS-SVD exhibits a sharp BBP-type phase transition: below a critical signal-to-noise threshold the leading singular vectors are asymptotically uninformative, while above it they achieve nontrivial alignment with the latent shared directions, with closed-form asymptotic overlap formulas. Simulations and semi-synthetic multimodal experiments corroborate the predicted phase diagram and recovery curves across aspect ratios, signal strengths, and missingness levels.
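The setup described in the abstract can be explored with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a rank-one spiked model with a shared Gaussian latent factor, unit-variance Gaussian noise, zero-filling of missing entries, and a plain `1/n` scaling of the cross-covariance. The function name `masked_pls_svd_overlap` and the parameters `theta`, `keep_x`, and `keep_y` are hypothetical, and the product `keep_x * keep_y` is only assumed to play the role of the joint retention probability $\rho$. The abstract's closed-form threshold and overlap formulas are not reproduced here.

```python
import numpy as np


def masked_pls_svd_overlap(n, p, q, theta, keep_x, keep_y, seed=0):
    """Squared overlaps of the top PLS-SVD singular vectors with the planted
    directions, under entry-wise MCAR masking (illustrative model only)."""
    rng = np.random.default_rng(seed)

    # Planted unit-norm shared directions in each view (assumed rank-one spike).
    u = rng.standard_normal(p)
    u /= np.linalg.norm(u)
    v = rng.standard_normal(q)
    v /= np.linalg.norm(v)

    # Shared latent factor plus independent unit-variance noise in each view.
    z = rng.standard_normal(n)
    X = np.sqrt(theta) * np.outer(z, u) + rng.standard_normal((n, p))
    Y = np.sqrt(theta) * np.outer(z, v) + rng.standard_normal((n, q))

    # Independent entry-wise masks: each entry is kept with probability keep_*.
    mask_x = rng.random((n, p)) < keep_x
    mask_y = rng.random((n, q)) < keep_y

    # Zero-fill missing entries and form the empirical cross-covariance.
    C = (X * mask_x).T @ (Y * mask_y) / n

    # PLS-SVD: leading left/right singular vectors of the cross-covariance.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    return float((U[:, 0] @ u) ** 2), float((Vt[0] @ v) ** 2)


if __name__ == "__main__":
    # Sweep the signal strength at a fixed missingness level.
    for theta in (0.0, 0.25, 0.5, 1.0, 2.0, 4.0):
        ou, ov = masked_pls_svd_overlap(1000, 100, 100, theta, 0.8, 0.8)
        print(f"theta={theta:4.2f}  overlap_u={ou:.3f}  overlap_v={ov:.3f}")
```

Sweeping `theta` or the retention probabilities in this sketch should show overlaps near zero for weak signals and clearly nonzero overlaps for strong ones, which is the qualitative behavior the abstract describes. At finite `n` the transition will look smoothed rather than sharp.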

Subjects: Machine Learning, Machine Learning

Publish: 2026-01-29 05:46:44 UTC