Multi-Object Representation Learning via Feature Connectivity and Object-Centric Regularization

Authors: Alex Foo, Wynne Hsu, Mong Li Lee

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on simulated, real-world, complex-texture and common-object images demonstrate a substantial improvement in the quality of discovered objects compared to state-of-the-art methods, as well as the sample efficiency and generalizability of our approach. We also show that the discovered object-centric representations can accurately predict key object properties in downstream tasks, highlighting the potential of our method to advance the field of multi-object representation learning.
Researcher Affiliation | Academia | Alex Foo, Wynne Hsu, Mong Li Lee, School of Computing, National University of Singapore, {alexfoo,whsu,leeml}@comp.nus.edu.sg
Pseudocode | No | The paper describes the methodology and algorithms in prose and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing its code or a direct link to a code repository for the methodology described.
Open Datasets | Yes | Table 1: Summary of dataset characteristics (Multi-dSprites, Tetrominoes-NM, SVHN, IDRiD, CLEVRTEX, CLEVRTEX-OOD, Flowers, Birds, COCO). Multi-dSprites [23] and Tetrominoes-NM: the former consists of multiple oval-, heart- or square-shaped sprites with some occlusions, while the latter is a subset of the original Tetrominoes dataset [23] in which images whose ground-truth segmentation requires knowledge of the object shapes are filtered out.
Dataset Splits | Yes | Following [35, 11], we use the first 60K samples in Multi-dSprites, Tetrominoes-NM and SVHN for training and hold out the next 320 samples for testing. For IDRiD, we split this dataset into 54 images for training and 27 images for testing. For CLEVRTEX, we use the first 40K samples for training and the last 5K samples for testing. For CLEVRTEX-OOD, we use 10K samples for testing. For Flowers, we use the first 6K samples for training and the last 1K samples for testing. For Birds, we use the first 10K samples for training and the last 1K samples for testing. For COCO, we use the first 10K samples for training and the last 2K samples for testing.
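Since the paper's code is not released, the quoted splits are purely index-based and easy to reproduce. The sketch below shows one way to realize them; the loader variables (e.g. multi_dsprites, clevrtex) are hypothetical placeholders and not part of the authors' code.

```python
# Minimal sketch of the index-based splits quoted above, assuming each dataset
# is already available as a torch Dataset. Loader names are placeholders.
from torch.utils.data import Subset

def split_by_index(dataset, n_train, n_test):
    """First n_train samples for training, the next n_test samples for testing."""
    train = Subset(dataset, range(0, n_train))
    test = Subset(dataset, range(n_train, n_train + n_test))
    return train, test

# e.g. Multi-dSprites / Tetrominoes-NM / SVHN: first 60K train, next 320 test
# train_set, test_set = split_by_index(multi_dsprites, n_train=60_000, n_test=320)

# CLEVRTEX uses the *last* 5K samples for testing, so the test split slices from the end:
# test_set = Subset(clevrtex, range(len(clevrtex) - 5_000, len(clevrtex)))
```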
Hardware Specification | Yes | Training on 64-by-64 images from Multi-dSprites on a single V100 GPU with 32GB of RAM takes about 10 minutes.
Software Dependencies | No | The paper mentions using "Adam [28]" as an optimizer and cites "PyTorch [39]", but it does not specify version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | We train OC-Net for 1000 iterations with a batch size of 64 using Adam [28] with a learning rate of 1×10⁻³. We carried out an initial experiment to choose the clustering threshold; the results show that its value can range from 0.2 to 2.0 without affecting the performance of OC-Net. As such, we set the threshold to 0.7 so that two pixels belong to the same object if their normalized feature similarity is more than 50%. If a pixel is assigned to multiple objects, we assign it to the mask of the first object in that list and ignore its membership in the other objects. For all methods, we set the maximum number of foreground objects to 6 and 4 for Multi-dSprites and Tetrominoes respectively. Training is carried out for 300,000 iterations with a batch size of 64, using the Adam optimizer with a base learning rate of 4×10⁻⁴. We set the size of the latent space to D = 64 for all models. For SVHN and COCO, the number of objects is set to 6. For IDRiD, the number of objects is set to 20 and training runs for 100,000 iterations. For CLEVRTEX and CLEVRTEX-OOD, the number of objects is set to 11. For Flowers and Birds, the number of objects is set to 2.
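The overlap-resolution rule quoted above ("if a pixel is assigned to multiple objects, we assign it to the mask of the first object in that list") is the one setup detail that is procedural rather than a hyperparameter, so a short sketch is given below. It only illustrates that rule; the feature-connectivity clustering itself is the paper's contribution and is not reproduced here, and candidate_masks is an assumed input of shape (K, H, W) holding boolean per-object masks.

```python
# Hedged sketch of the stated tie-breaking rule: a pixel claimed by several
# candidate objects is kept only in the earliest object's mask.
import torch

def resolve_overlaps(candidate_masks: torch.Tensor) -> torch.Tensor:
    """candidate_masks: bool tensor of shape (K, H, W); returns disjoint masks."""
    K, H, W = candidate_masks.shape
    resolved = torch.zeros_like(candidate_masks)
    claimed = torch.zeros(H, W, dtype=torch.bool)
    for k in range(K):
        keep = candidate_masks[k] & ~claimed  # pixels not yet assigned to an earlier object
        resolved[k] = keep
        claimed |= keep
    return resolved

# Optimizer settings as reported (values from the quote, model is a placeholder):
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # OC-Net, 1000 iterations
# optimizer = torch.optim.Adam(model.parameters(), lr=4e-4)   # main training, 300K iterations
```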