Implicit Generation and Modeling with Energy Based Models
Authors: Yilun Du, Igor Mordatch
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present techniques to scale MCMC-based EBM training on continuous neural networks, and we show its success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples than other likelihood models and nearing the performance of contemporary GAN approaches, while covering all modes of the data. We empirically demonstrate its effectiveness on a series of tasks, including image modeling, trajectory modeling, out-of-distribution detection, continual learning, adversarial robustness, and compositional generation. |
| Researcher Affiliation | Collaboration | Yilun Du, MIT CSAIL; Igor Mordatch, Google Brain. Work done at OpenAI. Correspondence to: yilundu@mit.edu |
| Pseudocode | Yes | Algorithm 1 Energy training algorithm |
| Open Source Code | Yes | Additional results, source code, and pre-trained models are available at https://sites.google.com/view/igebm |
| Open Datasets | Yes | ImageNet32x32, ImageNet128x128, CIFAR-10, the dSprites dataset [Higgins et al., 2017], and the Split MNIST task proposed in [Farquhar and Gal, 2018] |
| Dataset Splits | No | We generated 200,000 different trajectories of length 100, from a trained policy (with every 4th action set to a random action for diversity), with a 90-10 train-test split. The paper mentions train-test splits for some datasets but does not explicitly state a validation split for any experiment. |
| Hardware Specification | No | The paper mentions training a smaller network due to computational constraints but does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | In all our experiments, we sample from B 95% of the time and from uniform noise otherwise. We ran 20 steps of PGD as in [Madry et al., 2017] on the above logits. To undergo classification, we then ran 10 steps of sampling initialized from the starting image (with a bounded deviation of 0.03) from each conditional model, and then classified using the lowest-energy conditional class. We train a conditional EBM with 2 layers of 400 hidden units and compare with a generative conditional VAE baseline with both encoder/decoder having 2 layers of 400 hidden units. |
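The replay-buffer initialization described in the Experiment Setup row (sample MCMC chains from a buffer B 95% of the time, from uniform noise otherwise) can be sketched as follows. This is a minimal illustration, not the authors' released code; the function name `init_samples` and the NumPy-based implementation are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_samples(buffer, batch_size, shape, p_buffer=0.95):
    """Initialize MCMC chains for EBM training.

    With probability p_buffer, reuse a past sample from the replay
    buffer; otherwise start the chain from uniform noise in [0, 1).
    (Hypothetical helper illustrating the 95%/5% scheme from the paper.)
    """
    samples = np.empty((batch_size,) + shape, dtype=np.float64)
    for i in range(batch_size):
        if len(buffer) > 0 and rng.random() < p_buffer:
            # Reuse a previously generated sample (persistent chains).
            samples[i] = buffer[rng.integers(len(buffer))]
        else:
            # Fresh chain from uniform noise.
            samples[i] = rng.uniform(0.0, 1.0, size=shape)
    return samples
```

In the paper's Algorithm 1, the chains initialized this way are then refined with Langevin dynamics, and the resulting samples are pushed back into the buffer, so that later iterations resume from partially mixed chains.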