ArtGen: Conditional Generative Modeling of Articulated Objects in Arbitrary Part-Level States

Authors: Haowen Wang, Xiaoping Yuan, Fugang Zhang, Rui Jian, Yuanwei Zhu, Xiuquan Qiao, Yakun Huang

Generating articulated assets is crucial for robotics, digital twins, and embodied intelligence. Existing generative models often rely on single-view inputs representing closed states, resulting in ambiguous or unrealistic kinematic structures due to the entanglement between geometric shape and joint dynamics. To address these challenges, we introduce ArtGen, a conditional diffusion-based framework capable of generating articulated 3D objects with accurate geometry and coherent kinematics from single-view images or text descriptions at arbitrary part-level states. Specifically, ArtGen employs cross-state Monte Carlo sampling to explicitly enforce global kinematic consistency, reducing structural-motion entanglement. Additionally, we integrate a Chain-of-Thought reasoning module to infer robust structural priors, such as part semantics, joint types, and connectivity, guiding a sparse-expert Diffusion Transformer to specialize in diverse kinematic interactions. Furthermore, a compositional 3D-VAE latent prior enhanced with local-global attention effectively captures fine-grained geometry and global part-level relationships. Extensive experiments on the PartNet-Mobility benchmark demonstrate that ArtGen significantly outperforms state-of-the-art methods.

Subject: Computer Vision and Pattern Recognition

Published: 2025-12-13 17:00:03 UTC

arXiv: 2512.12395
