Meta-Learning MCMC Proposals
Authors: Tongzhou Wang, Yi Wu, David A. Moore, Stuart J. Russell
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our method of learning neural block proposals against a single-site Gibbs sampler as well as several model-specific MCMC methods. |
| Researcher Affiliation | Collaboration | Tongzhou Wang (Facebook AI Research, tongzhou.wang.1994@gmail.com); Yi Wu (University of California, Berkeley, jxwuyi@gmail.com); David A. Moore (Google, davmre@gmail.com); Stuart J. Russell (University of California, Berkeley, russell@cs.berkeley.edu) |
| Pseudocode | Yes | Algorithm 1 Neural Block Sampling (see the illustrative sketch after this table) |
| Open Source Code | No | The paper does not provide concrete access to source code for the described methodology. |
| Open Datasets | Yes | We use a dataset of 17494 sentences from the CoNLL-2003 Shared Task (https://www.clips.uantwerpen.be/conll2003/ner/). The CRF model is trained with AdaGrad [8] through 10 sweeps over the training dataset. We evaluate the performance of the trained neural block proposal on all 180 grid BNs up to 500 nodes from the UAI 2008 inference competition. |
| Dataset Splits | No | The paper does not provide specific dataset split information needed to reproduce the data partitioning. It mentions using a 'training dataset' and 'test dataset' but without percentages or specific counts for train/validation/test splits, nor does it refer to predefined splits with citations for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers needed to replicate the experiment. |
| Experiment Setup | Yes | In all experiments, we use the following guideline to design the proposal: (1) using small underlying MDNs (we pick networks with two hidden layers and ELU activation [6]), and (2) choosing an appropriate distribution to generate parameters of the motif such that the generated parameters cover the whole space as much as possible. The CRF model is trained with AdaGrad [8] through 10 sweeps over the training dataset. (An illustrative MDN sketch follows the table.) |
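The extracted text names Algorithm 1 ("Neural Block Sampling") but does not reproduce it, so the following is a minimal Python sketch of the generic pattern it refers to: a Metropolis-Hastings step that resamples a block of variables jointly from a learned proposal conditioned on the rest of the state. The interface (`sample`, `log_prob`, `log_joint`) and all names are hypothetical assumptions, not the authors' code; consult the paper's Algorithm 1 for the exact procedure.

```python
# Hedged sketch of one MH step with a learned block proposal.
# `proposal` is assumed to expose .sample(context) and
# .log_prob(values, context), e.g. a mixture density network
# conditioned on the block's surrounding variables.
import math
import random

def neural_block_mh_step(state, block, proposal, log_joint):
    """One Metropolis-Hastings step resampling the variables in `block` jointly.

    state:     dict mapping variable names to current values
    block:     list of variable names resampled together
    proposal:  learned conditional proposal (hypothetical interface above)
    log_joint: function returning the model's log joint density of a state
    """
    # Conditioning context: everything outside the block.
    context = {k: v for k, v in state.items() if k not in block}

    # Propose new values for the whole block at once.
    proposed_values = proposal.sample(context)
    proposed_state = dict(state)
    proposed_state.update(proposed_values)

    current_values = {k: state[k] for k in block}

    # Standard MH acceptance ratio with the learned proposal density;
    # the context is unchanged, so the reverse proposal conditions on it too.
    log_alpha = (log_joint(proposed_state) - log_joint(state)
                 + proposal.log_prob(current_values, context)
                 - proposal.log_prob(proposed_values, context))

    if random.random() < math.exp(min(0.0, log_alpha)):
        return proposed_state  # accept
    return state               # reject
```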
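The experiment-setup row specifies only "small underlying MDNs" with two hidden layers and ELU activations. The sketch below (PyTorch) shows one plausible such mixture density network for a one-dimensional proposal; the layer width, number of mixture components, and all names are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: a small MDN with two ELU hidden layers, emitting
# mixture weights, means, and scales. Widths/components are assumptions.
import torch
import torch.nn as nn

class SmallMDN(nn.Module):
    def __init__(self, context_dim, hidden=64, n_components=5):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(context_dim, hidden), nn.ELU(),
            nn.Linear(hidden, hidden), nn.ELU(),
        )
        self.logits = nn.Linear(hidden, n_components)      # mixture weights
        self.means = nn.Linear(hidden, n_components)       # component means
        self.log_scales = nn.Linear(hidden, n_components)  # component scales

    def forward(self, context):
        h = self.trunk(context)
        return self.logits(h), self.means(h), self.log_scales(h).exp()

# Hypothetical usage: form the proposal distribution for a batch of contexts.
mdn = SmallMDN(context_dim=10)
logits, means, scales = mdn(torch.randn(4, 10))
mix = torch.distributions.MixtureSameFamily(
    torch.distributions.Categorical(logits=logits),
    torch.distributions.Normal(means, scales),
)
sample = mix.sample()           # one proposed value per context
log_prob = mix.log_prob(sample) # proposal density, usable in an MH ratio
```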