Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Perturb-and-max-product: Sampling and learning in discrete energy-based models
Authors: Miguel Lázaro-Gredilla, Antoine Dedieu, Dileep George
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now study the empirical performance of PMP for sampling and learning. We measure performance using the maximum mean discrepancy (MMD) [9] between a set of model samples and a set of ground truth samples: ... Our experiments are run on a GeForce RTX 2080 Ti. (See the MMD sketch after the table.) |
| Researcher Affiliation | Industry | Miguel Lázaro-Gredilla, Antoine Dedieu, Dileep George, Vicarious AI, SF Bay Area, CA, EMAIL |
| Pseudocode | Yes | Algorithm 1: Learning and sampling with perturb-and-max-product (PMP). (See the PMP sketch after the table.) |
| Open Source Code | Yes | Code to reproduce these experiments can be found at https://github.com/vicariousinc/perturb_and_max_product/. |
| Open Datasets | Yes | Here we train a restricted Boltzmann machine (RBM) on MNIST data and sample from it using PMP. ... MNIST handwritten digit database, 2010. [18] |
| Dataset Splits | No | The paper uses datasets such as MNIST but does not explicitly provide training, validation, or test splits as percentages or sample counts. |
| Hardware Specification | Yes | Our experiments are run on a GeForce RTX 2080 Ti. |
| Software Dependencies | No | All compared methods are coded on JAX [4] and run with changes only in the samplers for maximum comparability of their runtimes. No version number for JAX is provided. |
| Experiment Setup | Yes | We use Adam for 200 iterations with 0.01 learning rate: each iteration considers 100 chains and runs 100 full sweeps. (Section 5.1). We use stochastic gradient descent with learning rate 0.01 and train for 200 epochs using minibatches of size 100. (Section 5.5). (See the optimizer sketch after the table.) |
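The MMD metric quoted in the Research Type row compares a set of model samples against a set of ground truth samples. For reference, here is a minimal JAX sketch of a biased squared-MMD estimator; the Gaussian kernel and its bandwidth are assumptions for illustration, since the table does not quote the paper's kernel choice.

```python
import jax
import jax.numpy as jnp

def gaussian_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 sigma^2)); the bandwidth is an assumption.
    return jnp.exp(-jnp.sum((x - y) ** 2) / (2.0 * sigma**2))

def mmd2(model_samples, gt_samples, sigma=1.0):
    """Biased estimate of squared MMD between two sample sets (rows = samples)."""
    # Vectorize the kernel over all pairs of rows from the two sets.
    pairwise = jax.vmap(jax.vmap(gaussian_kernel, (None, 0, None)), (0, None, None))
    k_xx = pairwise(model_samples, model_samples, sigma).mean()
    k_yy = pairwise(gt_samples, gt_samples, sigma).mean()
    k_xy = pairwise(model_samples, gt_samples, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy
```

For example, `mmd2(jnp.zeros((100, 784)), jnp.ones((100, 784)))` compares two toy sample sets; a smaller value indicates the two distributions are closer under the chosen kernel.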
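Algorithm 1 (perturb-and-max-product) pairs Gumbel perturbations of the potentials with max-product MAP inference. The sketch below illustrates the principle on a chain-structured MRF, where max-product is exact and reduces to Viterbi decoding; the function names and the chain restriction are assumptions for illustration, not the authors' implementation, which runs parallel max-product on general factor graphs.

```python
import jax
import jax.numpy as jnp

def pmp_sample_chain(key, unary, pairwise):
    """Draw one approximate sample via perturb-and-max-product on a chain MRF.

    unary:    (T, K) log-potentials for T variables with K states each.
    pairwise: (K, K) log-potential shared by all neighboring pairs.
    """
    # Perturb: add i.i.d. Gumbel noise to the unary log-potentials.
    perturbed = unary + jax.random.gumbel(key, unary.shape)

    # Max-product is exact on a chain (Viterbi). The forward pass keeps
    # running max-marginals and back-pointers.
    def forward(m_prev, u_t):
        scores = m_prev[:, None] + pairwise + u_t[None, :]  # (K, K)
        return scores.max(axis=0), scores.argmax(axis=0)

    final_m, backptrs = jax.lax.scan(forward, perturbed[0], perturbed[1:])

    # The backward pass decodes the MAP assignment of the perturbed model.
    x_last = final_m.argmax()

    def backtrack(x_next, back_t):
        x_t = back_t[x_next]
        return x_t, x_t

    _, earlier = jax.lax.scan(backtrack, x_last, backptrs, reverse=True)
    return jnp.append(earlier, x_last)
```

For example, `pmp_sample_chain(jax.random.PRNGKey(0), jnp.zeros((10, 2)), jnp.eye(2))` draws one sample from a 10-variable binary chain. Algorithm 1 in the paper covers both learning and sampling; this sketch covers only the sampling step.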
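The Experiment Setup row fixes the optimizer, learning rates, iteration counts, and batch size. Below is a minimal configuration sketch using optax; the optimizer library is an assumption, since the table only confirms that the code is written in JAX.

```python
import jax.numpy as jnp
import optax

# Hyperparameters quoted in the row above.
ADAM_LR, N_ITERS = 0.01, 200               # Section 5.1: 100 chains, 100 sweeps each
SGD_LR, N_EPOCHS, BATCH = 0.01, 200, 100   # Section 5.5

optimizer = optax.adam(learning_rate=ADAM_LR)  # optax.sgd(SGD_LR) for the Sec. 5.5 setup

# Placeholder parameters standing in for model weights (illustrative only).
params = {"W": jnp.zeros((784, 64)), "b": jnp.zeros(64)}
opt_state = optimizer.init(params)

def apply_gradients(params, opt_state, grads):
    # Standard optax update step; the repository may structure its loop differently.
    updates, opt_state = optimizer.update(grads, opt_state)
    return optax.apply_updates(params, updates), opt_state
```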