Explicitly disentangling image content from translation and rotation with spatial-VAE
Authors: Tristan Bepler, Ellen D. Zhong, Kotaro Kelley, Edward Brignole, Bonnie Berger
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that this framework, termed spatial VAE, effectively learns latent representations that disentangle image rotation and translation from content and improves reconstruction over standard VAEs on several benchmark datasets, including applications to modeling continuous 2-D views of proteins from single particle electron microscopy and galaxies in astronomical images. |
| Researcher Affiliation | Academia | Tristan Bepler, Massachusetts Institute of Technology, Cambridge, MA, tbepler@mit.edu; Ellen D. Zhong, Massachusetts Institute of Technology, Cambridge, MA, zhonge@mit.edu; Kotaro Kelley, New York Structural Biology Center, New York, NY, kkelley@nysbc.org; Edward Brignole, Massachusetts Institute of Technology, Cambridge, MA, brignole@mit.edu; Bonnie Berger, Massachusetts Institute of Technology, Cambridge, MA, bab@mit.edu |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code and data are available at: https://github.com/tbepler/spatial-VAE |
| Open Datasets | Yes | We train spatial-VAE models with rotation and translation inference on these datasets and the original MNIST dataset. ... The galaxy zoo dataset contains 61,578 training color images of galaxies from the Sloan Digital Sky Survey. We crop each image with random translation and downsample to 64x64 pixels following common practice [28]. |
| Dataset Splits | No | The paper specifies train and test splits (e.g., 16,000 training and 4,000 test images), but does not mention a separate validation split or its size. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | Models were implemented using PyTorch [25]. (No version number is provided for PyTorch.) |
| Experiment Setup | Yes | All models are trained using ADAM [24] with a learning rate of 0.0001 and minibatch size of 100. ... Models were trained for 500 epochs. ... We train spatial-VAEs with 20, 50, and 100 dimension unconstrained latent variables for 300 epochs... |
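
The core idea quoted in the Research Type row is that the decoder generates each pixel from a spatial coordinate plus an unconstrained latent, so rotation and translation can act on the coordinate grid instead of being entangled in the latent code. The sketch below illustrates that construction under stated assumptions; the class and function names (`SpatialDecoder`, `transform_coords`), layer sizes, and activations are illustrative choices, not the authors' implementation (which is available at the linked GitHub repository).

```python
import torch
import torch.nn as nn

class SpatialDecoder(nn.Module):
    """Coordinate-conditioned MLP: predicts the pixel value at spatial
    coordinate (x, y) given the unconstrained latent z, so rotation and
    translation can be applied to the coordinates rather than to z."""

    def __init__(self, z_dim, hidden_dim=500):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 + z_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, 1),  # one grayscale intensity per coordinate
        )

    def forward(self, coords, z):
        # coords: (batch, n_pixels, 2); z: (batch, z_dim)
        z = z.unsqueeze(1).expand(-1, coords.size(1), -1)
        return self.mlp(torch.cat([coords, z], dim=-1))  # (batch, n_pixels, 1)


def transform_coords(coords, theta, dx):
    """Rotate the coordinate grid by theta, then shift it by dx.
    coords: (batch, n_pixels, 2); theta: (batch,); dx: (batch, 2)."""
    cos, sin = torch.cos(theta), torch.sin(theta)
    rot = torch.stack(
        [torch.stack([cos, -sin], dim=-1),
         torch.stack([sin, cos], dim=-1)], dim=-2)  # (batch, 2, 2)
    return coords @ rot.transpose(1, 2) + dx.unsqueeze(1)


# Example coordinate grid for 28x28 images (e.g., MNIST), normalized to [-1, 1].
n = 28
ys, xs = torch.meshgrid(torch.linspace(-1, 1, n),
                        torch.linspace(-1, 1, n), indexing="ij")
grid = torch.stack([xs, ys], dim=-1).reshape(1, -1, 2)  # (1, n*n, 2)
```

Because the decoder sees only transformed coordinates, inferring theta and dx as separate latent variables is what disentangles pose from content.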
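The Experiment Setup row translates directly into a standard PyTorch training loop. Only the optimizer (Adam), learning rate (0.0001), minibatch size (100), and epoch count (500) below come from the paper; the `model` interface returning a reconstruction and a KL term, and the Bernoulli reconstruction likelihood, are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def train(model, dataset, epochs=500):
    # From the paper: ADAM optimizer, learning rate 0.0001, minibatch size 100.
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loader = torch.utils.data.DataLoader(dataset, batch_size=100, shuffle=True)
    for epoch in range(epochs):
        for x in loader:  # x: (100, n_pixels, 1) flattened images (assumed layout)
            recon, kl = model(x)  # hypothetical interface: reconstruction logits + KL term
            # Negative ELBO per image: reconstruction error plus KL divergence.
            rec_err = F.binary_cross_entropy_with_logits(
                recon, x, reduction="sum") / x.size(0)
            loss = rec_err + kl.mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
```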