Unsupervised Causal Generative Understanding of Images

Authors: Titas Anciukevicius, Patrick Fox-Roberts, Edward Rosten, Paul Henderson

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments applying our approach to test datasets that have zero probability under the training distribution. These show that it accurately reconstructs a scene's geometry, segments objects and infers their positions, despite not receiving any supervision. Our approach significantly outperforms baselines that do not capture the true causal image generation process. To evaluate our approach (Sec. 4) we create challenging test datasets that have zero probability under the corresponding training distribution, yet share aspects of its structure. We show that our model can generalize to unseen numbers of objects, unseen compositions, and radically new camera viewpoints, all significantly better than existing works. We conduct experiments on two synthetic datasets, using our model and three baselines. We also include ablations of our model, without MCMC inference, and with an unstructured latent space.
Researcher Affiliation | Collaboration | Titas Anciukevičius (Snap Inc., University of Edinburgh; titas.anciukevicius@gmail.com), Patrick Fox-Roberts (Snap Inc.), Edward Rosten (Snap Inc.), Paul Henderson (University of Glasgow)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the methodology described is publicly available.
Open Datasets | Yes | GQN: we render images of rooms containing several objects (cubes, cylinders, spheres), based on the rooms ring camera dataset of [35]; similar datasets were used in [45, 33], but in all cases without OOD test splits. ARROW: we render images using a modified version of the CLEVR dataset [59], similar to those in [57].
Dataset Splits | No | The paper describes the general training process and the types of datasets used for training and testing, but it does not provide specific numerical details on how the dataset was split into training, validation, or test sets (e.g., percentages or exact counts) in the main text.
Hardware Specification | No | The paper states, 'Implementation details for all models (hyperparameters, hardware, etc.) are given in the supplementary material,' but does not provide specific hardware details within the main text.
Software Dependencies | No | The paper states, 'Implementation details for all models (hyperparameters, hardware, etc.) are given in the supplementary material,' but does not provide specific software dependencies with version numbers within the main text.
Experiment Setup | No | The paper mentions using 'Adam for optimization [64], β-weighting of KL terms [50], and approximate each of the above expectations by a single sample. We also further approximate Ls by rendering only a random subset of pixels per minibatch.' However, it does not provide specific numerical values for hyperparameters or other system-level training settings in the main text, deferring 'More implementation details' to the supplementary material.
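For context on the training recipe quoted in the Experiment Setup row, the sketch below illustrates what a β-weighted ELBO estimated with a single latent sample and a random pixel subset per minibatch can look like. This is not the authors' code; the function name, the `encode`/`decode` interfaces, and all shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def beta_elbo_estimate(x, encode, decode, beta=0.5, pixel_subset=64):
    """Single-sample Monte Carlo estimate of a beta-weighted ELBO.

    x            : flat image, shape (num_pixels,)
    encode(x)    : returns mean and log-variance of q(z|x)  (assumed interface)
    decode(z)    : returns per-pixel reconstruction means   (assumed interface)
    beta         : KL weight, as in beta-VAE-style training
    pixel_subset : number of pixels used to approximate the reconstruction term
    """
    mu, logvar = encode(x)
    # Reparameterised single draw z ~ q(z|x): the expectation over q is
    # approximated by one sample, as the quoted passage describes.
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    recon = decode(z)
    # Approximate the reconstruction term on a random pixel subset, rescaled
    # so its expectation matches the full-image sum.
    idx = rng.choice(x.size, size=pixel_subset, replace=False)
    recon_term = -0.5 * np.sum((x[idx] - recon[idx]) ** 2) * (x.size / pixel_subset)
    # Closed-form KL between the diagonal Gaussian q(z|x) and a standard
    # normal prior, down-weighted by beta.
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return recon_term - beta * kl
```

Subsampling pixels keeps the per-step cost of rendering-based decoders manageable; rescaling by `x.size / pixel_subset` keeps the subsampled reconstruction term an unbiased estimate of the full-image one.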