Object-centric Learning with Cyclic Walks between Parts and Whole
Authors: Ziyu Wang, Mike Zheng Shou, Mengmi Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our rigorous experiments on seven image datasets in three unsupervised tasks demonstrate that the networks trained with our cyclic walks can disentangle foregrounds and backgrounds, discover objects, and segment semantic objects in complex scenes. |
| Researcher Affiliation | Academia | (1) Show Lab, National University of Singapore, Singapore; (2) Deep NeuroCognition Lab, CFAR and I2R, Agency for Science, Technology and Research, Singapore; (3) Nanyang Technological University, Singapore |
| Pseudocode | No | No explicit pseudocode or algorithm blocks labeled as such were found in the paper. The methodology is described in prose and mathematical formulations. |
| Open Source Code | Yes | Our source code and data are available at: link. |
| Open Datasets | Yes | We evaluate the quality of the predicted foreground and background masks with mean Intersection over Union (mIoU) [22] and Dice [22] (both metrics are sketched below the table). We include Stanford Dogs [23], Stanford Cars [25], CUB 200 Birds [43], and Flowers [32] as benchmark datasets. We benchmark all methods on the common datasets Pascal VOC 2012 [15], COCO 2017 [27], Movi-C [17] and Movi-E [17]. |
| Dataset Splits | Yes | We trained all models (Slot-Attention, SLATE, BO-QSA, DINOSAUR, and our Cyclic walks) with 250k training steps and selected their best models by keeping track of their best accuracies on the validation sets. |
| Hardware Specification | Yes | All models are trained on 4 Nvidia RTX A5000 GPUs with a total batch size of 128. All these experiments are run with the same hardware specifications and method configurations: (1) one single RTX-A5000 GPU; |
| Software Dependencies | No | No specific version numbers for software dependencies (e.g., 'PyTorch 1.9', 'Python 3.8') were provided. The paper mentions the 'AdamW optimizer' and 'PyTorch automatic mixed-precision' but without versions. |
| Experiment Setup | Yes | Our model is optimized by the AdamW optimizer [30] with a learning rate of 0.0004, 250k training steps, a linear warm-up of 5000 steps, and an exponentially decaying schedule. The gradient norm is clipped at 1. ... The temperature of cyclic walks is set to 0.1. We use a similarity threshold of 0.7 and ViT-S/8 of DINO [9] for all experiments. |
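
The Experiment Setup row is specific enough to transcribe into a training configuration. The following is a minimal sketch, not the authors' released code: the placeholder model, the dummy loss, and the exact decay rate are assumptions; only the optimizer choice, learning rate, step counts, warm-up length, gradient-clipping value, and the cyclic-walk temperature and similarity threshold are taken from the paper.

```python
# Minimal sketch of the reported optimization settings (assumed details are marked).
import torch
from torch import nn

model = nn.Linear(384, 384)  # placeholder standing in for the slot/feature model (384-d DINO ViT-S features)
optimizer = torch.optim.AdamW(model.parameters(), lr=4e-4)  # AdamW, lr 0.0004 (from the paper)

TOTAL_STEPS, WARMUP_STEPS = 250_000, 5_000   # from the paper
DECAY_RATE = 0.5                             # assumed: the paper only says "exponentially decaying"
TEMPERATURE, SIM_THRESHOLD = 0.1, 0.7        # reported cyclic-walk hyperparameters (objective not shown here)

def lr_lambda(step: int) -> float:
    """Linear warm-up for WARMUP_STEPS, then exponential decay (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        return step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return DECAY_RATE ** progress

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(3):  # a few illustrative steps; the paper trains for TOTAL_STEPS
    loss = model(torch.randn(8, 384)).pow(2).mean()  # dummy loss standing in for the cyclic-walk objective
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient norm clipped at 1
    optimizer.step()
    scheduler.step()
```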
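
The Open Datasets row names mIoU and Dice as the mask-quality metrics. Below is an illustrative sketch of both, assuming binary foreground masks stored as boolean NumPy arrays; it is not the authors' evaluation code. Averaging IoU over a dataset gives mIoU.

```python
# Illustrative mask metrics (assumption: binary masks as boolean arrays); not the authors' evaluation code.
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union else 1.0

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice coefficient: 2*|A & B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return float(2 * inter / total) if total else 1.0

# Toy example: two overlapping square masks.
pred = np.zeros((64, 64), dtype=bool); pred[16:48, 16:48] = True
gt = np.zeros((64, 64), dtype=bool); gt[20:52, 20:52] = True
print(f"IoU={iou(pred, gt):.3f}  Dice={dice(pred, gt):.3f}")
```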