Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Physical Graph Representations from Visual Scenes
Authors: Daniel Bear, Chaofei Fan, Damian Mrowca, Yunzhu Li, Seth Alter, Aran Nayebi, Jeremy Schwartz, Li F. Fei-Fei, Jiajun Wu, Josh Tenenbaum, Daniel L. Yamins
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3 Experiments and Analysis. Datasets, Baselines, and Evaluation Metrics. We compare PSGNet to recent CNN-based object discovery methods based on the quality of the self-supervised scene segmentations that they learn on three datasets. |
| Researcher Affiliation | Academia | 1Department of Psychology, Stanford University 2Department of Computer Science, Stanford University 3Wu Tsai Neurosciences Institute, Stanford University 4MIT CSAIL 5MIT Brain and Cognitive Sciences 6Neurosciences Ph.D. Program, Stanford University |
| Pseudocode | No | The paper describes procedures and architecture components but does not include any explicitly labeled pseudocode or algorithm blocks. It states 'formal definitions and implementation details can be found in the Supplement'. |
| Open Source Code | No | The paper does not contain any explicit statements about open-sourcing code or provide links to a code repository. |
| Open Datasets | Yes | We compare PSGNet to recent CNN-based object discovery methods based on the quality of the self-supervised scene segmentations that they learn on three datasets. Primitives is a synthetic dataset... Playroom is a synthetic dataset... Gibson is a subset of the data from the Gibson1.0 environment [3]... where [3] refers to 'Iro Armeni, Sasha Sax, Amir R Zamir, and Silvio Savarese. Joint 2d-3d-semantic data for indoor scene understanding. arXiv:1702.01105, 2017.' |
| Dataset Splits | No | The paper mentions 'held-out validation images' but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology). |
| Hardware Specification | Yes | We thank Google (TPUv2 team) and the NVIDIA corporation for generous donation of hardware resources. |
| Software Dependencies | Yes | Images in Primitives and Playroom are generated by ThreeDWorld (TDW), a general-purpose, multi-modal simulation platform built on Unity Engine 2019. |
| Experiment Setup | Yes | We always self-supervise QTR outputs from all PSG levels with the RGB values and the backward temporal difference magnitudes of the PSGNet's input movie, using the standard L2 loss. We also self-supervise a set of QSR outputs from the top PSG level on the bottom-up scene segmentations SL... this uses a softmax cross-entropy loss... Finally, except where indicated, we supervise QTR renderings on actual depth and surface normal vector images provided by the training datasets... |
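
The loss terms quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat-list data layout, and the equal weighting of the terms are all assumptions made for illustration, and the depth and surface-normal supervision is omitted.

```python
import math


def l2_loss(pred, target):
    """Mean squared error between two equal-length flat sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def temporal_difference_magnitude(frame_t, frame_prev):
    """Per-pixel absolute backward difference between consecutive frames."""
    return [abs(a - b) for a, b in zip(frame_t, frame_prev)]


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


def total_loss(rgb_pred, rgb_true, delta_pred, delta_true, seg_logits, seg_labels):
    """Sum of the three self-supervised terms.

    Equal weighting is an assumption; the quoted text does not give weights.
    seg_logits holds one list of class logits per pixel, seg_labels one
    integer segment id per pixel.
    """
    rgb_term = l2_loss(rgb_pred, rgb_true)
    delta_term = l2_loss(delta_pred, delta_true)
    seg_term = sum(
        softmax_cross_entropy(z, y) for z, y in zip(seg_logits, seg_labels)
    ) / len(seg_labels)
    return rgb_term + delta_term + seg_term
```

Here `delta_true` would come from `temporal_difference_magnitude` applied to two consecutive input frames, and `seg_labels` would play the role of the bottom-up segmentation used as the self-supervision target.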