42325@AAAI

#1 Physics Consistent World Models via Schrödinger-Bridge Optimal Transport for Computational Imaging and 3D-Consistent Video Generations

Author: Abhiram Srivatsa Kadaba

Modern generative models often violate basic physical principles. Shadows drift, geometry becomes inconsistent across views, and measurement models are ignored, which limits trust in both video synthesis and computational imaging. We propose a finite-time Schrödinger Bridge (SB) world model that formulates generation as entropy-regularized optimal transport from a simple prior to a distribution that is consistent with both data and physics. Instead of applying consistency corrections only at the final output, the framework introduces geometric and physical structure directly along the generative path. For video, the model enforces multiview geometric constraints through reprojection and epipolar agreement, homographies, and depth-guided warping. For imaging, it incorporates differentiable optical operators, including defocus models based on the point spread function (PSF) and lightweight Fourier propagation for coherent and partially coherent settings. When camera poses are known, the model penalizes reprojection error and warp-aligned photometric or feature inconsistencies. When poses are unknown, a compact motion or flow estimator encourages cycle-consistent trajectories. A lightweight UNet or Vision Transformer backbone, together with a short SB horizon, maintains computational efficiency. Evaluation will measure three-dimensional and temporal consistency, physics fidelity through forward-simulation residuals, and overall generative quality and efficiency using FID, KID, and FVD. Comparisons will include modern video diffusion models, plug-and-play data-consistency methods, and unconstrained SB variants. The central hypothesis is that constraining the entire generative trajectory, rather than only the final frame, can shorten sampling while improving cross-view coherence and physical plausibility across diverse sensing modalities, including cameras, microscopes, and medical imaging systems.
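Two of the quantities named in the abstract, the reprojection error used when camera poses are known and the forward-simulation residual used to measure physics fidelity, can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the function names, the pinhole camera model, the Gaussian stand-in for a defocus point spread function, and the circular (FFT-based) convolution are all assumptions chosen for brevity.

```python
import numpy as np

def reprojection_error(points_3d, pixels, K, R, t):
    """Mean pixel distance between observed pixels and projected 3D points.

    Assumes a pinhole camera with intrinsics K (3x3), rotation R (3x3),
    and translation t (3,); all points must lie in front of the camera.
    """
    cam = points_3d @ R.T + t        # world -> camera coordinates, shape (N, 3)
    proj = cam @ K.T                 # apply intrinsics
    uv = proj[:, :2] / proj[:, 2:3]  # perspective divide
    return float(np.mean(np.linalg.norm(uv - pixels, axis=1)))

def gaussian_psf(size, sigma):
    """Normalized Gaussian kernel, a hypothetical stand-in for a defocus PSF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def forward_blur(image, psf):
    """Forward operator: circular convolution of a 2D image with the PSF via FFT.

    Assumes the image is at least as large as the PSF in both dimensions.
    """
    k = psf.shape[0]
    padded = np.zeros_like(image, dtype=float)
    padded[:k, :k] = psf
    # Shift the kernel so its center sits at the origin of the FFT grid.
    padded = np.roll(padded, (-(k // 2), -(k // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def forward_residual(estimate, measurement, psf):
    """Mean squared mismatch between the re-simulated estimate and the measurement."""
    return float(np.mean((forward_blur(estimate, psf) - measurement) ** 2))
```

In a trajectory-constrained sampler of the kind the abstract describes, penalties like these would presumably be evaluated at intermediate states along the generative path rather than only on the final sample; how they are weighted and differentiated is not specified in the abstract.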

Subject: AAAI.2026 - Undergraduate Consortium