PointDiffusion: Diffusion-Based Scene Completion in the Point Cloud Domain

Authors: Chidera Agbasiere, Mikhail Sannikov, Faith Ogunwoye, Erik Shaikhiev, Alex Kozinov, Ilya Mikhalchuk, Iana Zhura, Dzmitry Tsetserukou

Reconstructing dense 3D scenes from sparse LiDAR point clouds is a fundamental challenge in autonomous driving, where latent diffusion models offer a promising solution. However, existing approaches rely on object-level autoencoders that collapse into unstable global representations at outdoor scale and suffer from ground truth data corrupted by odometry drift that systematically degrades supervision quality. Furthermore, multi-step diffusion inference incurs prohibitive latency for real-time deployment. We propose a novel multi-token Gaussian VAE with cross-attention pooling for stable scene-scale LiDAR compression, combined with an anchor-based ICP ground truth refinement pipeline that eliminates drift-induced noise from training supervision. Together, these components enable a scaffold-free single-step diffusion completion model that achieves an approximately 16x reduction in squared Chamfer distance on SemanticKITTI seq. 08 (0.396 m^2 to 0.024 m^2), surpasses LiDiff and ScoreLiDAR by 17-19% and 10-11%, respectively, and operates at 25-143x lower inference latency. Our results demonstrate that data quality dominates model design in this regime and that multi-token latent spaces provide a stable first stage for latent diffusion-based scene completion.
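The headline metric above is the squared Chamfer distance between a completed point cloud and its ground truth. The following is a minimal NumPy sketch of that metric, not the authors' evaluation code. It assumes the common convention of summing the two directional mean squared nearest-neighbour distances; some benchmarks average the two terms instead, and the abstract does not say which is used. The function name `squared_chamfer_distance` is hypothetical.

```python
import numpy as np

def squared_chamfer_distance(a, b):
    """Symmetric squared Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: the two directional terms are summed, not averaged.
    Brute force O(N * M) memory, so suitable only for small clouds.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise squared Euclidean distances via broadcasting, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Mean squared distance from each point to its nearest neighbour in the other set.
    a_to_b = d2.min(axis=1).mean()
    b_to_a = d2.min(axis=0).mean()
    return a_to_b + b_to_a

# Toy example with coordinates in metres, so the result is in m^2.
pred = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
gt = [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
cd = squared_chamfer_distance(pred, gt)
```

For scene-scale LiDAR clouds the dense distance matrix would not fit in memory, so a spatial index such as a k-d tree would be the usual substitute for the brute-force nearest-neighbour search. For reference, the reported drop from 0.396 m^2 to 0.024 m^2 is a ratio of 0.396 / 0.024 = 16.5, consistent with the "approximately 16x" figure.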

Subject: Computer Vision and Pattern Recognition

Publish: 2026-06-14 22:36:34 UTC

2606.16048
