Jointly Conditioned Diffusion Model for Multi-View Pose-Guided Person Image Synthesis

Authors: Chengyu Xie, Zhi Gong, Junchi Ren, Linkun Yu, Si Shen, Fei Shen, Xiaoyu Du

Pose-guided human image generation is limited by incomplete textures from single reference views and by the absence of explicit cross-view interaction. We present the jointly conditioned diffusion model (JCDM), a framework that exploits multi-view priors. The appearance prior module (APM) infers a holistic, identity-preserving prior from incomplete references, and the joint conditional injection (JCI) mechanism fuses multi-view cues and injects shared conditioning into the denoising backbone to align identity, color, and texture across poses. JCDM supports a variable number of reference views and integrates with standard diffusion backbones through minimal, targeted architectural modifications. Experiments demonstrate state-of-the-art fidelity and cross-view consistency.
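
As a rough illustration of how a variable number of reference views could be pooled into one shared conditioning vector, consider the minimal sketch below. It is not the JCI mechanism from the paper, whose details the abstract does not give; the attention-style pooling, the function name `fuse_views`, and the `query` vector are all assumptions made purely for illustration.

```python
import math


def fuse_views(view_feats, query):
    """Pool any number of per-view feature vectors into one vector.

    Hypothetical helper: each view is scored by a scaled dot product
    with `query`, the scores are normalized with a softmax, and the
    views are averaged using those weights.
    """
    if not view_feats:
        raise ValueError("need at least one reference view")
    dim = len(query)
    scale = math.sqrt(dim)

    # Scaled dot-product score for each reference view.
    scores = [sum(q * f for q, f in zip(query, feat)) / scale
              for feat in view_feats]

    # Numerically stable softmax over the views.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted average gives a single shared conditioning vector.
    fused = [sum(w * feat[i] for w, feat in zip(weights, view_feats))
             for i in range(dim)]
    return fused, weights
```

Because the softmax is taken over however many views are supplied, the same function accepts one, two, or more references without any change, which is the property the abstract highlights. In a real model the fused vector would be a learned feature injected into the denoising network; here it is only a plain list of floats.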

Subject: Computer Vision and Pattern Recognition

Publish: 2025-11-19 04:05:39 UTC

2511.15092
