Unsupervised Causal Generative Understanding of Images
Authors: Titas Anciukevicius, Patrick Fox-Roberts, Edward Rosten, Paul Henderson
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments applying our approach to test datasets that have zero probability under the training distribution. These show that it accurately reconstructs a scene's geometry, segments objects and infers their positions, despite not receiving any supervision. Our approach significantly out-performs baselines that do not capture the true causal image generation process. To evaluate our approach (Sec. 4) we create challenging test datasets that have zero probability under the corresponding training distribution, yet share aspects of its structure. We show that our model can generalize to unseen numbers of objects, unseen compositions, and radically new camera viewpoints, all significantly better than existing works. We conduct experiments on two synthetic datasets, using our model and three baselines. We also include ablations of our model, without MCMC inference, and with an unstructured latent space. |
| Researcher Affiliation | Collaboration | Titas Anciukevičius (Snap Inc., University of Edinburgh; titas.anciukevicius@gmail.com); Patrick Fox-Roberts (Snap Inc.); Edward Rosten (Snap Inc.); Paul Henderson (University of Glasgow) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the methodology described is publicly available. |
| Open Datasets | Yes | GQN. We render images of rooms containing several objects (cubes, cylinders, spheres), based on the `rooms_ring_camera` dataset of [35]; similar datasets were used in [45, 33], but in all cases without OOD test splits. ARROW. We render images using a modified version of the CLEVR dataset [59], similar to those in [57]. |
| Dataset Splits | No | The paper describes the general training process and the types of datasets used for training and testing, but it does not provide specific numerical details on how the dataset was split into training, validation, or test sets (e.g., percentages or exact counts) in the main text. |
| Hardware Specification | No | The paper states, 'Implementation details for all models (hyperparameters, hardware, etc.) are given in the supplementary material,' but does not provide specific hardware details within the main text. |
| Software Dependencies | No | The paper states, 'Implementation details for all models (hyperparameters, hardware, etc.) are given in the supplementary material,' but does not provide specific software dependencies with version numbers within the main text. |
| Experiment Setup | No | The paper mentions using 'Adam for optimization [64], β-weighting of KL terms [50], and approximate each of the above expectations by a single sample. We also further approximate Ls by rendering only a random subset of pixels per minibatch.' However, it does not provide specific numerical values for hyperparameters or other system-level training settings in the main text, deferring 'More implementation details' to the supplementary material. |
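
The training ingredients quoted in the Experiment Setup row (β-weighting of KL terms, a single sample per expectation, and scoring only a random subset of pixels) can be illustrated with a minimal sketch. It is not the authors' implementation. The diagonal-Gaussian posterior with a standard-normal prior, the squared-error reconstruction term, the rescaling of the pixel-subset sum, and the names `gaussian_kl`, `subsampled_beta_loss`, and `render_fn` are all assumptions made for illustration, since the main text gives no such details.

```python
import math
import random


def gaussian_kl(mu, log_var):
    """KL(N(mu, exp(log_var)) || N(0, I)) for one latent vector, summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def subsampled_beta_loss(target, render_fn, mu, log_var, beta, num_pixels, rng):
    """Single-sample, beta-weighted loss for one image, scored on a random pixel subset.

    target:    list of per-pixel intensities (a flattened image)
    render_fn: hypothetical callable (z, pixel_indices) -> list of predicted intensities
    """
    # One reparameterised sample z ~ q(z | x) stands in for the expectation.
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

    # Render and score only a random subset of pixels.
    idx = rng.sample(range(len(target)), num_pixels)
    pred = render_fn(z, idx)

    # Rescale so the subset sum estimates the full-image sum (an assumption of this sketch).
    scale = len(target) / num_pixels
    recon = scale * sum((p - target[i]) ** 2 for p, i in zip(pred, idx))

    # Beta-weighted KL term added to the reconstruction term.
    return recon + beta * gaussian_kl(mu, log_var)
```

In practice `render_fn` would be a differentiable renderer and the returned loss would be minimized with Adam in an autodiff framework. Here it is a plain callable so the sketch stays dependency-free.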