reproducibilityindex.ai

Improving Viewpoint-Independent Object-Centric Representations through Active Viewpoint Selection

Authors: Yinxuan Huang, Chengmin Gao, Bin Li, Xiangyang Xue

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate our proposed model, we focus on four tasks: unsupervised object segmentation, scene reconstruction, compositional generation, and novel viewpoint synthesis. For the first two tasks, we compare our active viewpoint selection strategy with the random viewpoint selection strategy to highlight its superiority. Additionally, we evaluate our model against other multi-viewpoint approaches, including SIMONe [7] and OCLOC [9, 10], as well as the single-image-based method LSD [11].
Researcher Affiliation	Academia	Yinxuan Huang, Chengmin Gao, Bin Li, Xiangyang Xue; Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University; yxhuang22@m.fudan.edu.cn, {19210240036, libin, xyxue}@fudan.edu.cn
Pseudocode	Yes	Algorithm 1: Multi-Viewpoint Slot Attention; Algorithm 2: Active Viewpoint Selection Algorithm
Open Source Code	Yes	We will provide open access to the data and code on GitHub at https://github.com/YinxuanH/active-viewpoint-selection.
Open Datasets	Yes	We generated three synthetic multi-object multi-viewpoint datasets, referred to as CLEVRTEX, GSO, and ShapeNet, to evaluate the performance of our model. These datasets were constructed based on the CLEVRTEX dataset [25], the GSO dataset [26], and the ShapeNet dataset [27], respectively. They were created using the official code provided by CLEVRTEX [25] and Kubric [28].
Dataset Splits	Yes	Table 2: Configurations of datasets. # of Images per split: CLEVRTEX: Train 5000, Valid 100, Test 100; GSO/ShapeNet: Train 5000, Valid 100, Test 100.
Hardware Specification	Yes	We train our model on 4 NVIDIA RTX 4090 GPUs over 4.5 days, while SIMONe is trained in 1.5 days, OCLOC in 2.5 days, and LSD in 1.5 days, all using the same GPU setup.
Software Dependencies	No	The paper mentions software such as PyTorch and DINO but does not specify version numbers (e.g., "We implement SIMONe using the PyTorch framework." and "Following DINOSAUR [32], we utilize the pretrained DINO to extract features from images").
Experiment Setup	Yes	Table 3: Hyperparameters of our model used in experiments. Lists detailed parameters such as Batch Size, Training Steps, Input Resolution, Patch Size, Channel Multipliers, Learning Rate, # Iterations, Slot Attr Size, Slot View Size, # Slots, etc. for various modules.
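
The training times quoted in the Hardware Specification row can be put on a common footing as GPU-days. The sketch below is only illustrative arithmetic on the quoted figures: it assumes all 4 GPUs were in use for the full stated duration of every run, which the quote implies ("all using the same GPU setup") but does not state outright. The dictionary keys and the helper name `gpu_days` are hypothetical labels and do not come from the paper or its code.

```python
# Training durations in days, as quoted in the Hardware Specification row.
# "Ours" is a hypothetical label for the paper's proposed model.
NUM_GPUS = 4  # NVIDIA RTX 4090; assumed identical for all four models
training_days = {"Ours": 4.5, "SIMONe": 1.5, "OCLOC": 2.5, "LSD": 1.5}


def gpu_days(days: float, num_gpus: int = NUM_GPUS) -> float:
    """Return GPU-days, assuming every GPU is busy for the whole run."""
    return days * num_gpus


# Compute the budget per model and print it alongside the ratio to "Ours".
compute = {name: gpu_days(days) for name, days in training_days.items()}
for name, value in compute.items():
    ratio = value / compute["Ours"]
    print(f"{name}: {value:g} GPU-days ({ratio:.2f}x of Ours)")
```

Under that assumption, the proposed model's budget is three times that of SIMONe or LSD, which is worth keeping in mind when planning a reproduction on the same hardware.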