Amortized Bethe Free Energy Minimization for Learning MRFs

Authors: Sam Wiseman, Yoon Kim

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we find that the proposed approach compares favorably with loopy belief propagation, but is faster, and it allows for attaining better held-out log likelihood than other recent approximate inference schemes. In Table 1 we show the correlation and the mean L1 distance between the true and approximated marginals for the various methods. In Table 2 we show results from learning the generative model alongside the inference network. Table 3 reports the held-out average NLL of learned RBMs, as estimated by AIS [46]. (A sketch of these marginal-agreement metrics follows the table.)
Researcher Affiliation | Academia | Sam Wiseman, Toyota Technological Institute at Chicago, Chicago, IL, USA (swiseman@ttic.edu); Yoon Kim, Harvard University, Cambridge, MA, USA (yoonkim@seas.harvard.edu)
Pseudocode | Yes | Algorithm 1: Saddle-point MRF Learning
Open Source Code | Yes | Code for duplicating experiments is available at https://github.com/swiseman/bethe-min.
Open Datasets | Yes | We train RBMs with 100 hidden units on the UCI digits dataset [1]... We consider learning a K = 30 state, 3rd-order directed neural HMM on sentences from the Penn Treebank [32].
Dataset Splits | Yes | For a randomly generated Ising model, we obtain 1000 samples each for train, validation, and test sets... We used a batch size of 32, and selected hyperparameters through random search, monitoring validation expected pseudo-likelihood [3] for all models; see the Supplementary Material. ...on sentences from the Penn Treebank [32] (using the standard splits and preprocessing by Mikolov et al. [35]).
Hardware Specification | Yes | Speed results were measured on the same 1080 Ti GPU.
Software Dependencies | No | The paper does not mention specific version numbers for software dependencies. It implies the use of common deep learning frameworks but does not provide the required version details.
Experiment Setup | Yes | We used a batch size of 32, and selected hyperparameters through random search... We train inference networks f and f_x to output pseudo-marginals τ and τ_x as in Algorithm 1, using I_1 = 1 and I_2 = 1 gradient updates per minibatch. (A sketch of the Bethe objective these networks minimize follows the table.)
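
The Research Type row cites Table 1 of the paper, which compares methods by the correlation and mean L1 distance between true and approximated marginals. As a minimal sketch of how such agreement metrics can be computed (an illustration only, not the authors' evaluation code; the function name and array layout are assumptions):

```python
import numpy as np

def marginal_agreement(true_marginals, approx_marginals):
    """Pearson correlation and mean L1 distance between two sets of
    marginals, treated as flat vectors of probabilities."""
    t = np.asarray(true_marginals, dtype=float).ravel()
    a = np.asarray(approx_marginals, dtype=float).ravel()
    corr = np.corrcoef(t, a)[0, 1]      # Pearson correlation coefficient
    mean_l1 = np.abs(t - a).mean()      # mean absolute (L1) difference
    return corr, mean_l1

# Hypothetical usage: exact marginals from enumeration vs. approximate pseudo-marginals.
exact = np.array([0.9, 0.1, 0.6, 0.4])
approx = np.array([0.85, 0.15, 0.55, 0.45])
print(marginal_agreement(exact, approx))
```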
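
The Experiment Setup row refers to Algorithm 1 (Saddle-point MRF Learning), in which, roughly, the inference networks f and f_x are updated to minimize Bethe free energies of the unclamped and clamped models, and the MRF parameters are then updated on the resulting log-likelihood surrogate. For reference, below is a minimal PyTorch-style sketch of the Bethe free energy of a pairwise discrete MRF; the function name, tensor layout, and the assumption of locally consistent pseudo-marginals are ours and do not reflect the released implementation.

```python
import torch

def bethe_free_energy(unary_logpot, pair_logpot, edges, tau_i, tau_ij, eps=1e-10):
    """Bethe free energy of a pairwise discrete MRF.

    unary_logpot: (N, K) log-potentials theta_i(x_i)
    pair_logpot:  (E, K, K) log-potentials theta_ij(x_i, x_j), one per edge
    edges:        list of (i, j) node-index pairs
    tau_i:        (N, K) unary pseudo-marginals
    tau_ij:       (E, K, K) pairwise pseudo-marginals

    If the pseudo-marginals are locally consistent, -F_Bethe approximates
    log Z (and is exact when the graph is a tree).
    """
    degree = torch.zeros(unary_logpot.size(0), device=unary_logpot.device)
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1

    # Expected energy: negative expected log-potentials under tau.
    energy = -(tau_i * unary_logpot).sum() - (tau_ij * pair_logpot).sum()

    # Bethe entropy: edge entropies minus over-counted node entropies.
    H_i = -(tau_i * (tau_i + eps).log()).sum(dim=1)
    H_ij = -(tau_ij * (tau_ij + eps).log()).sum(dim=(1, 2))
    bethe_entropy = H_ij.sum() - ((degree - 1.0) * H_i).sum()

    return energy - bethe_entropy
```

In the alternating scheme the quoted setup describes, each minibatch would take I_1 gradient steps on the inference-network parameters to reduce quantities of this form, followed by I_2 steps on the MRF parameters using the difference between the clamped and unclamped values as a surrogate for the negative log-likelihood.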