2507.05499

Total: 1

#1 LoomNet: Enhancing Multi-View Image Generation via Latent Space Weaving [PDF2] [Copy] [Kimi3] [REL]

Authors: Giulio Federico, Fabio Carrara, Claudio Gennaro, Giuseppe Amato, Marco Di Benedetto

Generating consistent multi-view images from a single image remains challenging. Lack of spatial consistency often degrades 3D mesh quality in surface reconstruction. To address this, we propose LoomNet, a novel multi-view diffusion architecture that produces coherent images by applying the same diffusion model multiple times in parallel to collaboratively build and leverage a shared latent space for view consistency. Each viewpoint-specific inference generates an encoding representing its own hypothesis of the novel view from a given camera pose, which is projected onto three orthogonal planes. For each plane, encodings from all views are fused into a single aggregated plane. These aggregated planes are then processed to propagate information and interpolate missing regions, combining the hypotheses into a unified, coherent interpretation. The final latent space is then used to render consistent multi-view images. LoomNet generates 16 high-quality and coherent views in just 15 seconds. In our experiments, LoomNet outperforms state-of-the-art methods on both image quality and reconstruction metrics, also showing creativity by producing diverse, plausible novel views from the same input.

Subject: Computer Vision and Pattern Recognition

Publish: 2025-07-07 21:46:50 UTC