Self-supervised pre-training is essential for 3D point cloud representation learning, as annotating the irregular, topology-free structures of point clouds is costly and labor-intensive. Masked autoencoders (MAEs) offer a promising framework but rely on explicit positional embeddings, such as patch center coordinates, which leak geometric information and limit data-driven structural learning. In this work, we propose Point-MaDi, a novel Point cloud Masked autoencoding Diffusion framework for pre-training that integrates a dual-diffusion pretext task into an MAE architecture to address this issue. Specifically, we introduce a center diffusion mechanism in the encoder, which noises and predicts the coordinates of both visible and masked patch centers without ground-truth positional embeddings. The predicted centers are then processed by a transformer with self-attention and cross-attention to capture intra- and inter-patch relationships. In the decoder, we design a conditional patch diffusion process, guided by the encoder's latent features and the predicted centers, to reconstruct masked patches directly from noise. This dual-diffusion design drives the learning of comprehensive global semantic and local geometric representations during pre-training, eliminating the need for external geometric priors. Extensive experiments on ScanObjectNN, ModelNet40, ShapeNetPart, S3DIS, and ScanNet demonstrate that Point-MaDi achieves superior performance across downstream tasks, surpassing Point-MAE by 5.51\% on OBJ-BG, 5.17\% on OBJ-ONLY, and 4.34\% on PB-T50-RS for 3D object classification on the ScanObjectNN dataset.
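To make the dual-diffusion pretext task concrete, the following is a minimal, hypothetical PyTorch sketch of the overall flow described above; it is not the authors' implementation. The module names (`CenterDenoiser`, `PatchDenoiser`), the noise schedule, and the simplified clean-sample prediction objective are all illustrative assumptions.

\begin{verbatim}
# Hypothetical sketch of the dual-diffusion pretext task (not the paper's code).
# Encoder side: noise patch centers and predict them (center diffusion).
# Decoder side: denoise masked patch points conditioned on latents + centers.
import torch
import torch.nn as nn

def add_noise(x, t, betas):
    # DDPM-style forward process: x_t = sqrt(a_bar)*x_0 + sqrt(1-a_bar)*eps
    a_bar = torch.cumprod(1.0 - betas, dim=0)[t]               # (B,)
    a_bar = a_bar.view(-1, *([1] * (x.dim() - 1)))             # broadcast to x
    eps = torch.randn_like(x)
    return a_bar.sqrt() * x + (1.0 - a_bar).sqrt() * eps, eps

class CenterDenoiser(nn.Module):
    # Toy encoder-side module: predicts clean centers from noised centers.
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Linear(3, dim)
        self.blocks = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(dim, 3)

    def forward(self, noisy_centers):                 # (B, G, 3)
        h = self.blocks(self.embed(noisy_centers))    # latent patch features
        return self.head(h), h                        # predicted centers, latents

class PatchDenoiser(nn.Module):
    # Toy decoder-side module: denoises patch points, conditioned on the
    # encoder latents and the predicted centers.
    def __init__(self, dim=128):
        super().__init__()
        self.point_embed = nn.Linear(3, dim)
        self.cond_embed = nn.Linear(dim + 3, dim)
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.GELU(),
                                  nn.Linear(dim, 3))

    def forward(self, noisy_patches, latents, centers):
        # noisy_patches: (B, G, N, 3), latents: (B, G, dim), centers: (B, G, 3)
        cond = self.cond_embed(torch.cat([latents, centers], dim=-1))
        h = self.point_embed(noisy_patches) + cond.unsqueeze(2)
        return self.head(h)                           # predicted clean points

# One illustrative training step on random data.
B, G, N, T = 2, 16, 32, 100                           # batch, patches, pts/patch, steps
betas = torch.linspace(1e-4, 0.02, T)
centers = torch.rand(B, G, 3)                         # ground-truth patch centers
patches = torch.rand(B, G, N, 3)                      # ground-truth patch points
t = torch.randint(0, T, (B,))

noisy_centers, _ = add_noise(centers, t, betas)       # center diffusion (encoder)
noisy_patches, _ = add_noise(patches, t, betas)       # patch diffusion (decoder)
center_net, patch_net = CenterDenoiser(), PatchDenoiser()
pred_centers, latents = center_net(noisy_centers)
pred_patches = patch_net(noisy_patches, latents, pred_centers)

# Simplified clean-sample prediction losses for both diffusion branches.
loss = nn.functional.mse_loss(pred_centers, centers) \
     + nn.functional.mse_loss(pred_patches, patches)
loss.backward()
\end{verbatim}

In this sketch the two losses stand in for the center and patch denoising objectives; masking, the cross-attention between visible and masked tokens, and a proper reverse-diffusion sampler are omitted for brevity.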