Unsupervised Learning of 3D Structure from Images

Authors: Danilo Jimenez Rezende, S. M. Ali Eslami, Shakir Mohamed, Peter Battaglia, Max Jaderberg, Nicolas Heess

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate high-quality samples and report log-likelihoods on several datasets, including ShapeNet [2], and establish the first benchmarks in the literature. We demonstrate the ability of our model to learn and exploit 3D scene representations in five challenging tasks.
Researcher Affiliation | Industry | Google DeepMind
Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that code for the described methodology is being released.
Open Datasets | Yes | We explore four data sets: Necker cubes, Primitives, MNIST3D and ShapeNet. MNIST3D: We extended the MNIST dataset [18] to create a 30 × 30 × 30 volumetric dataset. ShapeNet: The ShapeNet dataset [2] is a large dataset of 3D meshes of objects. We experiment with a 40-class subset of the dataset, commonly referred to as ShapeNet40.
Dataset Splits | No | The paper does not explicitly state specific train/validation/test dataset splits with percentages or sample counts. While it mentions 'training' and 'evaluation', precise split details are absent.
Hardware Specification | No | The paper does not explicitly describe any specific hardware (e.g., CPU, GPU models, memory, cloud instances) used for running its experiments.
Software Dependencies | No | The paper mentions software components like 'LSTMs', the 'Adam optimizer [14]', and 'OpenGL [22]', but it does not specify version numbers for these components for reproducibility.
Experiment Setup | Yes | For all experiments we used LSTMs with 300 hidden neurons and 10 latent variables per generation step. The context encoder f_c(c, s_{t-1}) was varied for each task. We used the Adam optimizer [14] for all experiments. As meshes are much lower-dimensional than volumes, we set the number of steps to be T = 1 when working with this representation.
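
To make the hyperparameters in the Experiment Setup row concrete, below is a minimal PyTorch sketch of a sequential latent-variable generator configured as reported: an LSTM core with 300 hidden units, 10 latent variables per generation step, and the Adam optimizer. The names (StepwiseGenerator, latent_dim, canvas, the 30 × 30 × 30 output size) are illustrative assumptions, not the authors' released code; the paper's full model additionally uses a task-specific context encoder f_c(c, s_{t-1}), a volumetric/mesh renderer, and a variational training objective, all omitted here.

```python
# Hypothetical sketch of the reported setup: LSTM with 300 hidden units,
# 10 latent variables per generation step, trained with Adam.
# Names and structure are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn


class StepwiseGenerator(nn.Module):
    def __init__(self, latent_dim=10, hidden_dim=300, canvas_dim=30 * 30 * 30):
        super().__init__()
        # Recurrent core: one LSTM cell with 300 hidden neurons (as reported).
        self.lstm = nn.LSTMCell(latent_dim, hidden_dim)
        # Maps the hidden state to an additive update of the output canvas.
        self.write = nn.Linear(hidden_dim, canvas_dim)
        self.latent_dim = latent_dim
        self.hidden_dim = hidden_dim
        self.canvas_dim = canvas_dim

    def forward(self, batch_size, num_steps=1, device="cpu"):
        # T = 1 for meshes, larger T for volumetric data (per the paper).
        h = torch.zeros(batch_size, self.hidden_dim, device=device)
        c = torch.zeros(batch_size, self.hidden_dim, device=device)
        canvas = torch.zeros(batch_size, self.canvas_dim, device=device)
        for _ in range(num_steps):
            # 10 latent variables drawn per generation step (prior sample).
            z = torch.randn(batch_size, self.latent_dim, device=device)
            h, c = self.lstm(z, (h, c))
            canvas = canvas + self.write(h)
        # Occupancy probabilities for an assumed 30 x 30 x 30 volume.
        return torch.sigmoid(canvas)


model = StepwiseGenerator()
optimizer = torch.optim.Adam(model.parameters())  # Adam, as stated in the paper
sample = model(batch_size=8, num_steps=1)  # e.g., T = 1 when working with meshes
```

This sketch only shows how the stated sizes fit together at sampling time; training would additionally require an inference network and reconstruction likelihood, which the paper describes but this summary does not reproduce.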