4D Reconstruction and Tracking
Participants reconstruct dynamic scenes and track points over time from an input video, following the D4RT-style reconstruction and tracking setting.
The 1st Workshop on Physical AI brings together researchers across vision, graphics, robotics, and generative modeling to advance physics-grounded understanding of the real world, from physical property estimation and 3D/4D reconstruction to differentiable simulation and physically plausible generation.
Physical AI seeks to endow AI systems with a deep, physics-grounded understanding of the real world. While recent advances in computer vision have enabled large-scale geometric reconstruction, multimodal reasoning, and generative modeling, most systems remain limited in modeling physical properties — mass, friction, material behavior, structural stability, deformation, and dynamic interactions.
This workshop integrates physical reasoning throughout the pipeline: physical property estimation, physics-informed 3D/4D reconstruction, differentiable simulation, physically plausible generation, and embodied interaction. Rather than treating physics as a downstream refinement, PhysAI positions it as a core inductive bias for representation learning and world modeling.
The workshop fosters interdisciplinary discussion across vision, graphics, robotics, and digital twinning, and is built around 14 invited talks, a panel discussion, and a competition on dynamic 4D reconstruction.
Physical attribute estimation and reasoning — materials, mass, friction, affordances.
Reconstruction from sparse or unconstrained inputs that respects physical priors.
Generative models and world simulators for physically consistent world modeling — a key recent goal of physical-world AI.
Reasoning from images, videos, and multi-modal data about physical behavior.
Robot learning in physics-grounded virtual environments and digital twins.
New benchmarks and datasets for physical scene understanding and reconstruction.
A line-up of leaders shaping the next generation of physics-grounded AI. Listed alphabetically.
* Speaker list is tentative; confirmations to be announced.
Full-day workshop · 14 invited talks · 2 coffee breaks · 45-min panel discussion · competition highlights.
Times are local to the ECCV 2026 venue. The final program will be released closer to the event.
A new synthetic benchmark with high-quality multi-view rendering and dense geometric annotations, built with Unreal Engine and assets from Fab Market, Objaverse, and Bedlam2.
Models generate a new-view video from an input video and a target camera trajectory, following the camera-controlled dynamic rendering setting.
All deadlines are 23:59 AoE. Dates will be finalized once ECCV 2026 schedules workshops.
A team spanning vision, graphics, robotics, and generative modeling — across Oxford VGG, NTU PVG, NTU MMLab, ETH Zurich, Google DeepMind, and NAVER LABS Europe.
For questions about the workshop, the challenge, or sponsorship opportunities, please reach out: