Meta-Learning MCMC Proposals

Authors: Tongzhou Wang, Yi Wu, David A. Moore, Stuart J. Russell

NeurIPS 2018

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental
    "In this section, we evaluate our method of learning neural block proposals against a single-site Gibbs sampler as well as several model-specific MCMC methods."
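For reference, single-site Gibbs (the baseline named above) resamples one variable at a time from its full conditional given all other variables. A minimal sketch, assuming a hypothetical model-specific helper sample_conditional that is not from the paper:

    def single_site_gibbs_sweep(state, variables, sample_conditional, rng):
        # One sweep of single-site Gibbs sampling: each variable is
        # resampled in turn from its conditional distribution given the
        # current values of all other variables.
        for var in variables:
            state[var] = sample_conditional(var, state, rng)
        return state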
Researcher Affiliation: Collaboration
    Tongzhou Wang, Facebook AI Research (tongzhou.wang.1994@gmail.com); Yi Wu, University of California, Berkeley (jxwuyi@gmail.com); David A. Moore, Google (davmre@gmail.com); Stuart J. Russell, University of California, Berkeley (russell@cs.berkeley.edu)
Pseudocode: Yes
    "Algorithm 1 Neural Block Sampling"
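Algorithm 1 itself appears in the paper; as a rough illustration of the idea (not the authors' code), one Metropolis-Hastings step with a learned block proposal might look like the sketch below. The names proposal, log_joint, and block are assumptions: proposal stands in for a trained mixture density network exposing sample and log_prob methods, and log_joint for the model's log density.

    import numpy as np

    def neural_block_mh_step(state, block, proposal, log_joint, rng):
        # Propose new values for a whole block of variables jointly from a
        # learned proposal q, then accept or reject with Metropolis-Hastings.
        # Condition on the variables outside the block (standing in for the
        # block's Markov blanket).
        context = {k: v for k, v in state.items() if k not in block}

        proposed_values = proposal.sample(context)   # draw from q(. | context)
        proposed = dict(state)
        proposed.update(zip(block, proposed_values))

        current_values = [state[k] for k in block]

        # Standard MH log-acceptance ratio with the learned proposal as q.
        log_alpha = (log_joint(proposed) - log_joint(state)
                     + proposal.log_prob(current_values, context)
                     - proposal.log_prob(proposed_values, context))

        if np.log(rng.uniform()) < log_alpha:
            return proposed
        return state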
Open Source Code: No
    The paper does not provide concrete access to source code for its methodology.
Open Datasets: Yes
    "We use a dataset of 17494 sentences from the CoNLL-2003 Shared Task" (https://www.clips.uantwerpen.be/conll2003/ner/). "The CRF model is trained with AdaGrad [8] through 10 sweeps over the training dataset." "We evaluate the performance of the trained neural block proposal on all 180 grid BNs up to 500 nodes from the UAI 2008 inference competition."
Dataset Splits: No
    The paper does not provide the dataset split information needed to reproduce the data partitioning. It mentions a 'training dataset' and a 'test dataset', but gives no percentages or counts for train/validation/test splits and does not cite predefined splits.
Hardware Specification: No
    The paper does not specify the hardware used to run its experiments.
Software Dependencies: No
    The paper does not list the ancillary software and version numbers needed to replicate the experiments.
Experiment Setup: Yes
    "In all experiments, we use the following guideline to design the proposal: (1) using small underlying MDNs (we pick networks with two hidden layers and ELU activation [6]), and (2) choosing an appropriate distribution to generate parameters of the motif such that the generated parameters could cover the whole space as much as possible." "The CRF model is trained with AdaGrad [8] through 10 sweeps over the training dataset."
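Following that guideline, a small mixture density network with two ELU hidden layers might look like the PyTorch sketch below. The hidden width, number of mixture components, learning rate, and synthetic data are illustrative assumptions, not values from the paper; the AdaGrad/10-sweep schedule mirrors the quoted CRF setup rather than a documented MDN training recipe.

    import torch
    import torch.nn as nn

    class MDN(nn.Module):
        # Small mixture density network with two ELU hidden layers, per the
        # paper's guideline; hidden width and K components are guesses.
        def __init__(self, in_dim, out_dim, hidden=64, K=10):
            super().__init__()
            self.out_dim, self.K = out_dim, K
            self.body = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ELU(),
                nn.Linear(hidden, hidden), nn.ELU(),
            )
            self.logits = nn.Linear(hidden, K)               # mixture weights
            self.mu = nn.Linear(hidden, K * out_dim)         # component means
            self.log_sigma = nn.Linear(hidden, K * out_dim)  # component log-scales

        def log_prob(self, x, y):
            h = self.body(x)
            log_w = torch.log_softmax(self.logits(h), dim=-1)          # (B, K)
            mu = self.mu(h).view(-1, self.K, self.out_dim)
            sigma = self.log_sigma(h).view(-1, self.K, self.out_dim).exp()
            comp = torch.distributions.Normal(mu, sigma)
            comp_lp = comp.log_prob(y.unsqueeze(1)).sum(-1)            # (B, K)
            return torch.logsumexp(log_w + comp_lp, dim=-1)            # (B,)

    # Synthetic stand-in data; AdaGrad for 10 sweeps as in the quoted setup.
    loader = [(torch.randn(32, 8), torch.randn(32, 2)) for _ in range(100)]
    model = MDN(in_dim=8, out_dim=2)
    opt = torch.optim.Adagrad(model.parameters(), lr=0.01)
    for sweep in range(10):
        for x, y in loader:
            loss = -model.log_prob(x, y).mean()  # maximum likelihood
            opt.zero_grad()
            loss.backward()
            opt.step()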