Energy-Inspired Models: Learning with Sampler-Induced Distributions
Authors: John Lawson, George Tucker, Bo Dai, Rajesh Ranganath
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We describe and evaluate three instantiations of such models based on truncated rejection sampling, self-normalized importance sampling, and Hamiltonian importance sampling. These models outperform or perform comparably to the recently proposed Learned Accept/Reject Sampling algorithm [5] and provide new insights on ranking Noise Contrastive Estimation [34, 46] and Contrastive Predictive Coding [57]. |
| Researcher Affiliation | Collaboration | Dieterich Lawson, Stanford University, jdlawson@stanford.edu; George Tucker, Bo Dai, Google Research, Brain Team, {gjt, bodai}@google.com; Rajesh Ranganath, New York University, rajeshr@cims.nyu.edu |
| Pseudocode | Yes | Algorithm 1 TRS(π, U, T) generative process; Algorithm 2 SNIS(π, U) generative process; Algorithm 3 HIS(π, U, ϵ, α0:T) generative process (a sketch of the SNIS process appears after this table) |
| Open Source Code | Yes | Code and image samples: sites.google.com/view/energy-inspired-models. |
| Open Datasets | Yes | We evaluated the proposed models on a set of synthetic datasets, binarized MNIST [43] and Fashion MNIST [69], and continuous MNIST, Fashion MNIST, and CelebA [45]. |
| Dataset Splits | No | The paper mentions evaluating on a 'test set' but does not explicitly provide details about training/validation/test splits or a separate validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | The paper mentions 'tuning hyperparameters' and refers to 'Appendix D for details on the datasets, network architectures, and other implementation details' but does not provide concrete hyperparameter values or detailed training configurations within the main text. |
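
Since the paper's algorithms are only listed by name above, here is a minimal NumPy sketch of how a self-normalized importance sampling (SNIS) generative process of this form typically works: draw K candidates from the proposal π, weight each by exp(U(x)), and resample one candidate in proportion to its weight. The names `snis_sample`, `proposal_sampler`, and `energy_fn` are hypothetical illustrations, not the authors' released code (which is linked above).

```python
import numpy as np

def snis_sample(proposal_sampler, energy_fn, num_proposals=128, rng=None):
    """Minimal sketch of an SNIS(pi, U) generative process (assumed form).

    Draws K candidates from the proposal pi, weights each candidate by
    exp(U(x)), and resamples one in proportion to its weight, so the
    returned sample is approximately distributed as pi(x) * exp(U(x)) / Z.
    """
    if rng is None:
        rng = np.random.default_rng()
    # K i.i.d. candidates from the proposal distribution pi.
    xs = proposal_sampler(num_proposals)          # shape (K, dim)
    # Unnormalized log importance weights: log(pi(x) e^{U(x)} / pi(x)) = U(x).
    log_w = np.array([energy_fn(x) for x in xs])
    # Self-normalize in log space for numerical stability.
    w = np.exp(log_w - log_w.max())
    # Resample a single candidate index in proportion to its weight.
    idx = rng.choice(num_proposals, p=w / w.sum())
    return xs[idx]

# Example: proposal N(0, I); energy U(x) = -||x||^2 sharpens the target.
rng = np.random.default_rng(0)
sample = snis_sample(
    proposal_sampler=lambda k: rng.normal(size=(k, 2)),
    energy_fn=lambda x: -float(x @ x),
    rng=rng,
)
```

The resampling step is what makes this a generative process rather than just an estimator: the categorical draw over self-normalized weights converts the K proposals into a single approximate sample from the energy-tilted target.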