Slot-VAE: Object-Centric Scene Generation with Slot Attention
Authors: Yanbo Wang, Letao Liu, Justin Dauwels
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments evaluate: i) image decomposition performance, ii) sample quality and structure accuracy of generated samples, and iii) disentanglement performance. Our extensive evaluation of the scene generation ability indicates that Slot-VAE outperforms slot representation-based generative baselines in terms of sample quality and scene structure accuracy. |
| Researcher Affiliation | Academia | 1Department of EEMCS, Delft University of Technology, Delft, Netherlands 2School of EEE, Nanyang Technological University, Singapore. Correspondence to: Yanbo Wang <y.wang27@tudelft.nl>. |
| Pseudocode | No | The paper describes the model architecture and training process in detail across sections 3 and Appendix B, but does not include a formal 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper mentions that 'Some of the baseline models already released their trained models for Object Room, Shape Stacks or Arrow Room.' However, it does not state that the code for Slot-VAE is open-source, nor does it provide a link. |
| Open Datasets | Yes | The experiments involve three datasets: Object Room (Kabra et al., 2019), Shape Stacks (Groth et al., 2018), and Arrow Room (Jiang & Ahn, 2020). Kabra, R., Burgess, C., Matthey, L., Kaufman, R. L., Greff, K., Reynolds, M., and Lerchner, A. Multi-object datasets. https://github.com/deepmind/multiobject-datasets/, 2019. |
| Dataset Splits | No | The paper mentions '10000 warm-up steps are used' and specifies 'batch size' and 'learning rate' for each dataset, but it does not explicitly provide the training, validation, and test dataset splits (e.g., percentages or counts) or refer to a standard split used for the experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used to run the experiments. |
| Software Dependencies | No | The paper does not list any specific software dependencies or libraries with their version numbers. |
| Experiment Setup | Yes | In the experiments, 10000 warm-up steps are used. For Object Room, the batch size is 64 and the learning rate is 0.0004; for Shape Stacks, the batch size is 32 and the learning rate is 0.0001; and for Arrow Room, the batch size is 32 and the learning rate is 0.0001 in the early training steps, decreased to 0.00005 after object-centric representations emerge, for training stability. The β values are 0.01 for Object Room, 0.1 for Shape Stacks, and 0.1 for Arrow Room. |
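The per-dataset hyperparameters reported above can be collected into a single configuration table. This is an illustrative sketch only: the dictionary layout, dataset keys, and the `config_for` helper are our own naming conventions, not part of any released Slot-VAE code.

```python
# Hypothetical summary of the per-dataset hyperparameters reported in the paper.
# Keys and structure are illustrative assumptions, not from official code.
HYPERPARAMS = {
    "object_room": {"batch_size": 64, "lr": 4e-4, "beta": 0.01},
    "shape_stacks": {"batch_size": 32, "lr": 1e-4, "beta": 0.1},
    # For Arrow Room, the paper reduces the learning rate to 5e-5 once
    # object-centric representations emerge, for training stability.
    "arrow_room": {"batch_size": 32, "lr": 1e-4,
                   "lr_after_slots_emerge": 5e-5, "beta": 0.1},
}
WARMUP_STEPS = 10_000  # shared across all three datasets


def config_for(dataset: str) -> dict:
    """Return the reported training configuration for one dataset."""
    cfg = dict(HYPERPARAMS[dataset])  # copy so callers can mutate safely
    cfg["warmup_steps"] = WARMUP_STEPS
    return cfg
```

For example, `config_for("object_room")` yields the batch size 64, learning rate 0.0004, β = 0.01, and the shared 10000 warm-up steps.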