Unsupervised Discovery of Object Radiance Fields
Authors: Hong-Xing Yu, Leonidas Guibas, Jiajun Wu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate uORF on factorized scene representation learning (e.g., segmentation in 3D) and scene generation (e.g., novel view synthesis, scene editing in 3D). Our evaluation is on three datasets with gradually increasing complexity: first, CLEVR-like scenes with primitive foreground shapes; second, room scenes with complex chair shapes and textured backgrounds; third, more diverse room scenes with various foreground shapes and backgrounds. Our results show that uORF learns factorized representations that can segment 3D scenes into objects with fine shape details (e.g., thin chair legs) and backgrounds with well-recovered appearance details (e.g., irregular textures of a wooden floor). |
| Researcher Affiliation | Academia | Hong-Xing Yu, Stanford University; Leonidas J. Guibas, Stanford University; Jiajun Wu, Stanford University |
| Pseudocode | Yes | We show pseudo-code of our background-aware slot attention in Appendix (Alg. 1). |
| Open Source Code | Yes | Code and data can be found at https://kovenyu.com/uORF/. To ensure reproducibility of our work, we have provided the training and test code repository, together with all three synthetic datasets, and pre-trained models on all three datasets. |
| Open Datasets | Yes | CLEVR-567. The first dataset includes scenes of 5-7 CLEVR objects (Johnson et al., 2017)... Room-Diverse. The third dataset includes scenes of diverse foreground object shapes and background appearances... whose shape is randomly sampled from 1,200 ShapeNet chair shapes (Chang et al., 2015)... To ensure reproducibility of our work, we have provided the training and test code repository, together with all three synthetic datasets, and pre-trained models on all three datasets. |
| Dataset Splits | No | The paper specifies training and testing sets, but does not explicitly mention a dedicated 'validation' set or its split percentages/counts for model tuning. |
| Hardware Specification | Yes | Our model is trained on a single Nvidia RTX 3090 GPU for about 6 days. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer', 'VGG16', and 'StyleGAN2' but does not provide specific version numbers for these or other libraries/frameworks. |
| Experiment Setup | Yes | We set λ_percept = 0.006, λ_adv = 0.01, λ_R = 10. For coarse training, we bilinearly downsample supervision images to 64×64. The coarse training lasts for 600K iterations. For fine training, we randomly crop 64×64 patches from 128×128 images. The fine training lasts for 600K iterations. For all networks except the discriminator, we use the Adam optimizer with learning rate 0.0003, β1 = 0.9 and β2 = 0.999. The learning rate is exponentially decreased by half every 200K iterations until after 600K iterations. We also adopt the learning rate warm-up from the slot attention paper (Locatello et al., 2020) for the first 1K iterations. We render each pixel with 64 samples. |
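
The "Experiment Setup" row above quotes concrete optimizer settings (Adam with lr 0.0003, β1 = 0.9, β2 = 0.999, a 1K-iteration warm-up, and halving the learning rate every 200K iterations until 600K). The following is a minimal PyTorch sketch of that schedule, not the authors' code: the function names and the use of `LambdaLR` are illustrative assumptions; only the numeric values come from the quoted text.

```python
# Hedged sketch of the optimizer and learning-rate schedule described in the
# "Experiment Setup" row. Only the numeric values are taken from the paper;
# all identifiers here are illustrative, not the authors' actual code.
import torch

LR = 3e-4                # reported learning rate (all networks except the discriminator)
BETAS = (0.9, 0.999)     # reported Adam betas
WARMUP_ITERS = 1_000     # warm-up over the first 1K iterations (Locatello et al., 2020)
DECAY_EVERY = 200_000    # halve the learning rate every 200K iterations
DECAY_UNTIL = 600_000    # no further decay after 600K iterations

# Reported loss weights; the loss terms themselves are not reproduced here.
LAMBDA_PERCEPT, LAMBDA_ADV, LAMBDA_R = 0.006, 0.01, 10.0


def lr_lambda(it: int) -> float:
    """Multiplicative LR factor: linear warm-up, then halving every 200K iters."""
    warmup = min(it / WARMUP_ITERS, 1.0)
    decay = 0.5 ** (min(it, DECAY_UNTIL) // DECAY_EVERY)
    return warmup * decay


def make_optimizer_and_scheduler(model: torch.nn.Module):
    """Builds Adam with the reported hyperparameters and the assumed LR schedule."""
    optimizer = torch.optim.Adam(model.parameters(), lr=LR, betas=BETAS)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    return optimizer, scheduler
```

In a training loop, `scheduler.step()` would be called once per iteration (not per epoch) so that the warm-up and the 200K-iteration decay intervals line up with the iteration counts reported in the paper.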