Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DynaVol: Unsupervised Learning for Dynamic Scenes through Object-Centric Voxelization
Authors: Yanpeng Zhao, Siyu Gao, Yunbo Wang, Xiaokang Yang
ICLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we initially evaluate DynaVol on simulated 3D dynamic scenes that contain different numbers of objects, diverse motions, shapes (such as cubes, spheres, and real-world shapes), and materials (such as rubber and metal). On the simulated dataset, we can directly assess the performance of DynaVol for scene decomposition by projecting the object-centric volumetric representations onto 2D planes, and compare it with existing approaches, such as SAVi (Kipf et al., 2022) and uORF (Yu et al., 2022). Additionally, we demonstrate the effectiveness of DynaVol in novel view synthesis and dynamic scene editing using real-world videos. |
| Researcher Affiliation | Academia | Yanpeng Zhao, Siyu Gao, Yunbo Wang, Xiaokang Yang. MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University. EMAIL |
| Pseudocode | Yes | Algorithm 1 Pseudocode of the 3D-to-4D voxel expansion algorithm |
| Open Source Code | No | The paper provides a link to a project page (https://sites.google.com/view/dynavol/), which states 'Code coming soon!' at the time of review. This does not constitute concrete access to source code for the methodology described. |
| Open Datasets | Yes | We build the 8 synthetic dynamic scenes in Table 1 using the Kubric simulator (Greff et al., 2022). Each scene spans 60 timestamps and contains different numbers of objects in various colors, shapes, and textures. We also adopt 4 real-world scenes from HyperNeRF (Park et al., 2021) and D2NeRF (Wu et al., 2022), as shown in Table 2. |
| Dataset Splits | No | The paper describes how data is used (e.g., 'evaluated over 60 novel views'), but it does not provide specific percentages or sample counts for train/validation/test dataset splits needed for reproducibility of data partitioning. |
| Hardware Specification | Yes | All experiments run on an NVIDIA RTX3090 GPU and last for about 3 hours. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify versions of programming languages, libraries, or other software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | We set the size of the voxel grid to 110³, the assumed number of maximum objects to N = 10, and the dimension of slot features to D = 64. We use 4 hidden layers with 64 channels in the renderer, and use the Adam optimizer with a batch of 1,024 rays in the two training stages. The base learning rates are 0.1 for the voxel grids and 1e-3 for all model parameters in the warmup stage and then adjusted to 0.08 and 8e-4 in the second training stage. The two training stages last for 50k and 35k iterations respectively. The hyperparameters in the loss functions are set to α_p = 0.1, α_e = 0.01, α_w = 1.0, α_c = 1.0. |
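
The values quoted in the Experiment Setup row can be collected into a small configuration object, which makes the two-stage schedule easier to read at a glance. The following is only a sketch: the class names, field names, and the `stage_at` helper are hypothetical and do not come from the paper or any released code; only the numeric values are taken from the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """One training stage (hypothetical container, not from the paper's code)."""
    iterations: int
    lr_voxel_grid: float    # base learning rate for the voxel grids
    lr_model_params: float  # base learning rate for all other model parameters


@dataclass(frozen=True)
class DynaVolSetup:
    """Hyperparameters as quoted in the Experiment Setup row; names are assumed."""
    voxel_grid_size: tuple = (110, 110, 110)  # 110^3 voxel grid
    max_objects: int = 10                     # N
    slot_dim: int = 64                        # D
    renderer_hidden_layers: int = 4
    renderer_channels: int = 64
    rays_per_batch: int = 1024                # Adam optimizer, 1,024 rays per batch
    warmup: StageConfig = StageConfig(50_000, 0.1, 1e-3)
    second: StageConfig = StageConfig(35_000, 0.08, 8e-4)
    alpha_p: float = 0.1
    alpha_e: float = 0.01
    alpha_w: float = 1.0
    alpha_c: float = 1.0

    @property
    def total_iterations(self) -> int:
        return self.warmup.iterations + self.second.iterations


def stage_at(cfg: DynaVolSetup, iteration: int) -> StageConfig:
    """Return the stage active at a 0-based global iteration (assumed indexing)."""
    if not 0 <= iteration < cfg.total_iterations:
        raise ValueError("iteration outside the training schedule")
    return cfg.warmup if iteration < cfg.warmup.iterations else cfg.second


cfg = DynaVolSetup()
print(cfg.total_iterations)                # sum of the two stage lengths
print(stage_at(cfg, 60_000).lr_voxel_grid)  # learning rate in the second stage
```

The sketch records base learning rates only; the quoted text does not describe any decay within a stage, so none is modeled here.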