Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

Authors: Chen Wang, Chuhao Chen, Yiming Huang, Zhiyang Dou, Yuan Liu, Jiatao Gu, Lingjie Liu

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct comprehensive evaluations of our method, demonstrating our model can produce physics-plausible motion trajectories. We further show that the generated trajectories can be used as the input for a trajectory-conditioned video model for image-to-video generation, outperforming existing methods in both visual quality and physical plausibility. [...] 5 Experiments 5.1 Evaluation on Image-to-Video Generation [...] 5.2 Evaluation on Generative Dynamics [...] 5.3 Ablation Study
Researcher Affiliation	Academia	1 University of Pennsylvania, 2 MIT, 3 HKUST; equal contribution; EMAIL EMAIL; EMAIL
Pseudocode	No	The paper describes the architecture of the generative physics network in Figure 3 and the training losses with equations (6), (7), (8), (10), and (11). However, it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	No	We don't provide the code during submission. We will open-source the code, model and checkpoints after acceptance.
Open Datasets	Yes	To make our model handle diverse objects and motion trajectories, we generate data using physics simulation using high-quality 3D objects selected from Objaverse-XL [16, 15].
Dataset Splits	Yes	We randomly leave out 100 animations from this dataset as the test set and keep the remaining ones for training.
Hardware Specification	Yes	We train our base model on the 150K elastic subset that contains different force and physical parameters with 6 layers and 256 latent size on 8 NVIDIA L40 GPUs with 48GB GPU memory for 60K iterations with a total batch size of 32, which takes about 30 hours.
Software Dependencies	No	The paper mentions using MPM and rigid body simulators for data generation, and refers to several pre-trained models such as DaS [24], SAM [39], SV3D [81], LGM [78], and VGGT [83], as well as GPT-4o for evaluation. However, it does not provide specific version numbers for these software components or any other ancillary software dependencies.
Experiment Setup	Yes	For metric comparison and ablation, we train our base model on the 150K elastic subset that contains different force and physical parameters with 6 layers and 256 latent size on 8 NVIDIA L40 GPUs with 48GB GPU memory for 60K iterations with a total batch size of 32, which takes about 30 hours. We randomly leave out 100 animations from this dataset as the test set and keep the remaining ones for training. We train a large model of different materials with 12 layers and a 512 latent size on all 550K data with the same iterations and batch size, which takes about 80 hours. We use AdamW optimizer with betas (0.9, 0.999) and a learning rate of 1e-4 with a cosine schedule and a warmup of 100 steps. We clip the gradient with the maximum norm of 1.0 and train with bfloat16 precision. We use a DDIM scheduler for sampling.
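
The Experiment Setup row quotes a learning rate of 1e-4 with a cosine schedule, a 100-step warmup, 60K iterations, and gradient clipping at a maximum norm of 1.0. The sketch below shows one plausible reading of those settings in plain Python. It is not the authors' code: the linear warmup shape, the decay floor of zero, and the helper names `learning_rate` and `clip_scale` are assumptions made for illustration only.

```python
import math

# Values quoted in the Experiment Setup row.
BASE_LR = 1e-4
WARMUP_STEPS = 100
TOTAL_STEPS = 60_000
MAX_GRAD_NORM = 1.0


def learning_rate(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
                  total_steps=TOTAL_STEPS, min_lr=0.0):
    """Learning rate at a 0-indexed training step (hypothetical helper).

    Assumption: warmup is linear and the cosine phase decays to min_lr;
    the source only says "cosine schedule and a warmup of 100 steps".
    """
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the cosine phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def clip_scale(grad_norm, max_norm=MAX_GRAD_NORM, eps=1e-6):
    """Factor by which gradients are multiplied under global-norm clipping.

    Gradients whose norm is already below max_norm are left unchanged.
    """
    return min(1.0, max_norm / (grad_norm + eps))


if __name__ == "__main__":
    for s in (0, 99, 100, 30_050, 59_999):
        print(f"step {s:>6}: lr = {learning_rate(s):.3e}")
    print("scale for grad norm 4.0:", clip_scale(4.0))
```

Under these assumptions the rate climbs to 1e-4 over the first 100 steps, sits at about half that value midway through the cosine phase, and approaches zero at 60K iterations. A different warmup shape or a nonzero floor would change the curve but not the quoted hyperparameters.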