Contrastive Training of Complex-Valued Autoencoders for Object Discovery

Authors: Aleksandar Stanić, Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Here we provide our experimental results. We first describe details of the datasets, baseline models, training procedure and evaluation metrics. We then show results (always across 5 seeds) on grouping of our CtCAE model compared to the baselines (CAE and our variant CAE++), separability in phase space, generalization capabilities w.r.t. the number of objects seen at train/test time, and ablation studies for each of our design choices."
Researcher Affiliation | Academia | Aleksandar Stanić¹, Anand Gopalakrishnan¹, Kazuki Irie², Jürgen Schmidhuber¹,³. ¹The Swiss AI Lab, IDSIA, USI & SUPSI, Lugano, Switzerland; ²Center for Brain Science, Harvard University, Cambridge, USA; ³AI Initiative, KAUST, Thuwal, Saudi Arabia. Emails: {aleksandar, anand, juergen}@idsia.ch, kirie@fas.harvard.edu
Pseudocode | Yes | "Algorithm 1: Mining positive and negative pairs for a single anchor for the contrastive objective. Scalar parameters are k_top, the number of candidates from which to sample one positive pair, and m_bottom, the number of candidates from which to sample (M − 1) negative pairs."
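The mining step in Algorithm 1 is easy to misread in its flattened form, so here is a minimal PyTorch sketch of the top-k / bottom-m scheme it describes. The function name `mine_pairs`, the cosine-similarity measure, and the uniform sampling are assumptions made for illustration; the official repository linked below is the authoritative implementation.

```python
import torch
import torch.nn.functional as F

def mine_pairs(addresses, anchor_idx, k_top, m_bottom, num_negatives):
    # addresses: (N, D) address vectors (e.g., magnitude components).
    # Returns index tensors into `addresses`: pos_idx of shape (1,) and
    # neg_idx of shape (num_negatives,). Here num_negatives corresponds
    # to (M - 1) in the paper's notation.
    sims = F.cosine_similarity(
        addresses, addresses[anchor_idx].unsqueeze(0), dim=-1)  # (N,)

    # Exclude the anchor itself from both candidate pools.
    cand = torch.arange(addresses.shape[0])
    cand = cand[cand != anchor_idx]
    cand_sims = sims[cand]

    # One positive sampled uniformly from the k_top most similar candidates.
    top = cand[cand_sims.topk(k_top).indices]
    pos_idx = top[torch.randint(k_top, (1,))]

    # num_negatives negatives sampled uniformly, without replacement,
    # from the m_bottom least similar candidates.
    bottom = cand[(-cand_sims).topk(m_bottom).indices]
    neg_idx = bottom[torch.randperm(m_bottom)[:num_negatives]]
    return pos_idx, neg_idx
```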
Open Source Code | Yes | "Official code repository: https://github.com/agopal42/ctcae"
Open Datasets | Yes | "We evaluate the models on three datasets from the Multi-Object Datasets suite [41], namely Tetrominoes, dSprites and CLEVR (Figure 2), used by prior work in object-centric learning [14, 15, 45]." Reference [41]: Rishabh Kabra, Chris Burgess, Loic Matthey, Raphael Lopez Kaufman, Klaus Greff, Malcolm Reynolds, and Alexander Lerchner. Multi-Object Datasets. https://github.com/deepmind/multi_object_datasets/, 2019.
Dataset Splits | No | "In Tetrominoes and dSprites the number of training images is 60K, whereas in CLEVR it is 50K. All three datasets have 320 test images on which we report all the evaluation metrics." This specifies training and test set sizes but does not mention a validation set or its split details.
Hardware Specification | Yes | "We report all training times using a single Nvidia V100 GPU."
Software Dependencies | No | The paper mentions optimizers such as Adam and components such as BatchNorm and ReLU within the model architecture, but it does not provide version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or solvers used.
Experiment Setup | Yes | Table 7 (general training hyperparameters):

Hyperparameter | Tetrominoes | dSprites | CLEVR
Training Steps | 50 000 | 100 000 | 100 000
Batch size | 64 | 64 | 64
Learning rate | 4e-4 | 4e-4 | 4e-4

Table 8 (contrastive learning hyperparameters for the CtCAE model, common across datasets):

Hyperparameter | Value
Loss coefficient (in total loss sum) | 1e-4
Temperature | 0.05
Contrastive learning addresses | Magnitude
Contrastive learning features | Phase
...
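To make the reported setup concrete, the tables above can be restated as a plain configuration, paired with a standard InfoNCE loss at the reported temperature of 0.05. This is a sketch under stated assumptions: the names CONFIG, CONTRASTIVE, and info_nce are illustrative rather than objects from the official repository, and the cosine-similarity logits are an assumption.

```python
import torch
import torch.nn.functional as F

# Reported hyperparameters (Tables 7 and 8), restated as plain dicts.
CONFIG = {
    "tetrominoes": {"train_steps": 50_000,  "batch_size": 64, "lr": 4e-4},
    "dsprites":    {"train_steps": 100_000, "batch_size": 64, "lr": 4e-4},
    "clevr":       {"train_steps": 100_000, "batch_size": 64, "lr": 4e-4},
}
CONTRASTIVE = {"loss_coefficient": 1e-4, "temperature": 0.05}

def info_nce(anchor, positive, negatives, temperature=0.05):
    # anchor: (D,), positive: (D,), negatives: (M - 1, D).
    # Cross-entropy with the positive at class index 0 is the usual
    # InfoNCE formulation; the similarity measure here is an assumption.
    pos_logit = F.cosine_similarity(anchor, positive, dim=-1).unsqueeze(0)
    neg_logits = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=-1)
    logits = torch.cat([pos_logit, neg_logits]) / temperature
    target = torch.zeros(1, dtype=torch.long)
    return F.cross_entropy(logits.unsqueeze(0), target)

# The contrastive term enters the total objective scaled by its coefficient:
# loss = reconstruction_loss + CONTRASTIVE["loss_coefficient"] * info_nce(...)
```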