Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
ROOTS: Object-Centric Representation and Rendering of 3D Scenes
Authors: Chang Chen, Fei Deng, Sungjin Ahn
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, in addition to generation quality, we also demonstrate that the learned representation permits object-wise manipulation and novel scene generation, and generalizes to various settings. Results can be found on our project website: https://sites.google.com/view/roots3d. Keywords: object-centric representations, latent variable models, 3D scene generation, variational inference, 3D-aware representations |
| Researcher Affiliation | Academia | Chang Chen EMAIL Department of Computer Science Rutgers University Piscataway, NJ 08854, USA |
| Pseudocode | Yes | Appendix D. Summary of ROOTS Encoder and Decoder Algorithm 1 ROOTS Encoder Algorithm 2 ROOTS Decoder |
| Open Source Code | No | The paper states: "Results can be found on our project website: https://sites.google.com/view/roots3d." However, this only refers to results and does not explicitly state that the source code for the methodology is available. |
| Open Datasets | Yes | For evaluation on realistic objects, we also included a publicly available ShapeNet arrangement data set (Tung et al., 2019; Cheng et al., 2018). |
| Dataset Splits | Yes | Both data sets contain 60K multi-object scenes (50K for training, 5K for validation, and 5K for testing) with complete groundtruth scene specifications including object shapes, colors, positions, and sizes. Each scene is rendered as 128 × 128 color images from 30 random viewpoints. During training, we sample 10-20 viewpoints uniformly at random as contexts and use the rest as queries. For evaluation and visualization, we use 15 viewpoints as contexts and the rest as queries. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processors, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions software components like "RMSprop (Hinton et al., 2012)", "Conv DRAW (Gregor et al., 2016)", "Adam (Kingma and Ba, 2015)" and refers to "the official implementation" of IODINE, but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | ROOTS is trained for 200 epochs with a batch size of 12, using RMSprop (Hinton et al., 2012) with learning rates chosen from {1×10⁻³, 3×10⁻⁴, 1×10⁻⁴, 3×10⁻⁵}. We set γ = 7 and ρ = 0.999 during training, thereby encouraging the model to decompose the scenes into as few objects as possible. |
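The Dataset Splits row above can be made concrete with a short sketch. This is a hypothetical reconstruction of the 50K/5K/5K scene split and the 10-20 context-viewpoint sampling described in the quoted text; the function names (`split_scenes`, `sample_viewpoints`) and the use of a seeded `random.Random` are assumptions, not the authors' code.

```python
import random

def split_scenes(n_scenes=60_000, n_train=50_000, n_val=5_000, seed=0):
    """Hypothetical 50K/5K/5K split of the 60K multi-object scenes."""
    rng = random.Random(seed)
    ids = list(range(n_scenes))
    rng.shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test

def sample_viewpoints(n_views=30, n_context=(10, 20), seed=0):
    """Sample 10-20 of the 30 viewpoints as contexts; the rest become queries."""
    rng = random.Random(seed)
    k = rng.randint(*n_context)  # inclusive bounds, as in the paper's 10-20 range
    views = list(range(n_views))
    rng.shuffle(views)
    return views[:k], views[k:]
```

At evaluation time the paper fixes 15 context viewpoints instead of sampling, which here would correspond to calling `sample_viewpoints(n_context=(15, 15))`.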
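The Experiment Setup row lists the reported hyperparameters; a minimal sketch of the learning-rate sweep they imply is below. The helper name `lr_search_configs` and the dict layout are assumptions for illustration only; only the numeric values (200 epochs, batch size 12, the four-point learning-rate grid, γ = 7, ρ = 0.999) come from the paper.

```python
def lr_search_configs():
    """Enumerate one training config per learning rate in the reported grid."""
    base = {
        "optimizer": "RMSprop",  # as cited: Hinton et al., 2012
        "epochs": 200,
        "batch_size": 12,
        "gamma": 7,      # encourages decomposition into few objects
        "rho": 0.999,
    }
    grid = (1e-3, 3e-4, 1e-4, 3e-5)
    return [dict(base, lr=lr) for lr in grid]
```

Each config differs only in `lr`, matching the paper's description of choosing the learning rate from a fixed set while holding the other settings constant.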