2508.10894

Total: 1

#1 MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data [PDF2] [Copy] [Kimi2] [REL]

Authors: Antoine Labatie, Michael Vaccaro, Nina Lardiere, Anatol Garioud, Nicolas Gonthier

Self-supervised learning holds great promise for remote sensing, but standard self-supervised methods must be adapted to the unique characteristics of Earth observation data. We take a step in this direction by conducting a comprehensive benchmark of fusion strategies and normalization schemes of reconstruction targets for multimodal, multitemporal, and multispectral Earth observation data. Based on our findings, we introduce MAESTRO, a novel adaptation of the Masked Autoencoder with optimized fusion mechanisms and a normalization scheme that incorporates a spectral prior as a self-supervisory signal. Evaluated on four Earth observation datasets in both intra- and cross-dataset settings, MAESTRO achieves state-of-the-art performance on tasks that strongly rely on multitemporal dynamics, while also remaining competitive on others. Code to reproduce all our experiments is available at https://github.com/ignf/maestro.

Subject: Computer Vision and Pattern Recognition

Publish: 2025-08-14 17:58:45 UTC