Object-centric Learning with Cyclic Walks between Parts and Whole

Authors: Ziyu Wang, Mike Zheng Shou, Mengmi Zhang

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our rigorous experiments on seven image datasets in three unsupervised tasks demonstrate that the networks trained with our cyclic walks can disentangle foregrounds and backgrounds, discover objects, and segment semantic objects in complex scenes.
Researcher Affiliation | Academia | 1 Show Lab, National University of Singapore, Singapore; 2 Deep Neuro Cognition Lab, CFAR and I2R, Agency for Science, Technology and Research, Singapore; 3 Nanyang Technological University, Singapore
Pseudocode | No | No explicit pseudocode or algorithm blocks labeled as such were found in the paper. The methodology is described in prose and mathematical formulations. (An illustrative sketch of a cyclic-walk objective is given after this table.)
Open Source Code | Yes | Our source code and data are available at: link.
Open Datasets | Yes | We evaluate the quality of the predicted foreground and background masks with mean Intersection over Union (mIoU) [22] and Dice [22]. We include Stanford Dogs [23], Stanford Cars [25], CUB 200 Birds [43], and Flowers [32] as benchmark datasets. We benchmark all methods on the common datasets Pascal VOC 2012 [15], COCO 2017 [27], MOVi-C [17], and MOVi-E [17]. (A sketch of the mIoU and Dice metrics follows the table.)
Dataset Splits | Yes | We trained all models (Slot-Attention, SLATE, BO-QSA, DINOSAUR, and our Cyclic walks) with 250k training steps and selected their best models by keeping track of their best accuracies on the validation sets.
Hardware Specification | Yes | All models are trained on 4 Nvidia RTX A5000 GPUs with a total batch size of 128. All these experiments are run with the same hardware specifications and method configurations: (1) one single RTX-A5000 GPU;
Software Dependencies | No | No specific version numbers for software dependencies (e.g., 'PyTorch 1.9', 'Python 3.8') were provided. The paper mentions the 'AdamW optimizer' and 'PyTorch automatic mixed-precision', but without versions.
Experiment Setup | Yes | Our model is optimized by the AdamW optimizer [30] with a learning rate of 0.0004, 250k training steps, a linear warm-up of 5000 steps, and an exponentially decaying schedule. The gradient norm is clipped at 1. ... The temperature of cyclic walks is set to 0.1. We use similarity threshold 0.7 and ViT-S/8 of DINO [9] for all experiments. (A hedged training-configuration sketch follows the table.)
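
Since the paper describes its method only in prose and equations, the following is a minimal, illustrative sketch of a part-whole cyclic walk loss. It assumes slot vectors ("whole") and DINO patch features ("parts") as inputs; the temperature of 0.1 is taken from the quoted experiment setup, but the exact transition definitions, normalization, and loss form are assumptions, not the authors' verified implementation.

```python
import torch
import torch.nn.functional as F

def cyclic_walk_loss(slots: torch.Tensor, feats: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Illustrative part-whole cyclic walk: patch features -> slots -> patch features.

    slots: (K, D) slot vectors (the "whole"); feats: (N, D) patch features (the
    "parts"). The round-trip transition matrix is encouraged to be the identity,
    so every feature should walk back to itself after passing through the slots.
    """
    slots = F.normalize(slots, dim=-1)
    feats = F.normalize(feats, dim=-1)
    sim = feats @ slots.t() / tau            # (N, K) temperature-scaled cosine similarities
    p_f2s = sim.softmax(dim=-1)              # transition probabilities: features -> slots
    p_s2f = sim.t().softmax(dim=-1)          # transition probabilities: slots -> features
    roundtrip = p_f2s @ p_s2f                # (N, N) round-trip probabilities
    targets = torch.arange(feats.size(0), device=feats.device)
    return F.nll_loss(torch.log(roundtrip + 1e-8), targets)  # push round trip towards identity
```

Calling `cyclic_walk_loss(slots, feats)` on dummy tensors of shape (K, D) and (N, D) returns a scalar loss; the real training objective and any symmetric whole-to-part walk are not reproduced here.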
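
As a reference for the foreground/background evaluation quoted in the Open Datasets row, here is a small, self-contained sketch of IoU and Dice for a pair of binary masks. The function name and the binary-mask assumption are mine; the paper's exact evaluation code is not reproduced here.

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, gt: np.ndarray) -> tuple:
    """Compute IoU and Dice for two binary masks (foreground = 1)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    iou = inter / union if union > 0 else 1.0    # two empty masks count as a perfect match
    dice = 2 * inter / total if total > 0 else 1.0
    return float(iou), float(dice)
```

Mean IoU (mIoU) is then the average of the per-image IoU values over the evaluation set.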
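
The optimizer and schedule details quoted in the last two rows translate roughly to the PyTorch configuration below. This is a hedged reconstruction: the decay rate in `lr_lambda`, the mixed-precision usage pattern, and the placeholder model and loss are assumptions; only the learning rate (4e-4), the 5000-step linear warm-up, the 250k total steps, and gradient-norm clipping at 1 come from the quoted text.

```python
import torch

model = torch.nn.Linear(8, 1)  # placeholder; the actual cyclic-walks model goes here
optimizer = torch.optim.AdamW(model.parameters(), lr=4e-4)

warmup_steps, total_steps = 5_000, 250_000

def lr_lambda(step: int) -> float:
    # Linear warm-up for the first 5k steps, then exponential decay (decay rate assumed).
    if step < warmup_steps:
        return step / warmup_steps
    return 0.5 ** ((step - warmup_steps) / (total_steps - warmup_steps))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
scaler = torch.cuda.amp.GradScaler()  # PyTorch automatic mixed precision

def train_step(batch: torch.Tensor) -> float:
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():
        loss = model(batch).mean()     # placeholder loss; the paper's objective goes here
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)         # unscale gradients before clipping their norm
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)
    scaler.update()
    scheduler.step()
    return loss.item()
```

Running `train_step(torch.randn(4, 8))` in a loop exercises the warm-up, decay, clipping, and mixed-precision pieces mentioned in the paper's setup, without claiming to match its exact hyperparameter schedule.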