Bounding the Test Log-Likelihood of Generative Models
Authors: Yoshua Bengio; Li Yao; KyungHyun Cho
ICLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically evaluate the CSL estimator on a real dataset to investigate the rate at which the estimator converges. We report here the experimental result on denoising auto-encoders (DAE), generative stochastic networks (GSN), restricted Boltzmann machines (RBM), deep Boltzmann machines (DBM), and deep belief nets (DBNs). All models in these experiments were trained on the binarized MNIST data (thresholding at 0.5). The CSL estimates of the test set on the following models were evaluated. For each model, every 100-th sample from a Markov chain was collected to compute the CSL estimate. For more details on the architecture and training procedure of each model, see Appendix A. Table 1: The CSL estimates obtained using different numbers of samples of latent variables. |
| Researcher Affiliation | Academia | Yoshua Bengio 1,2, Li Yao 2, and Kyunghyun Cho 3; 1 CIFAR Senior Fellow; 2 Département d'Informatique et de Recherche Opérationnelle, Université de Montréal; 3 Department of Information and Computer Science, Aalto University School of Science |
| Pseudocode | Yes | Algorithm 1 (CSL) requires a set S of samples of the latent variables h from a Markov chain, a conditional distribution P(x|h), and a set X of test samples. 1: LL ← 0; 2: for x in X do; 3: r ← 0; 4: for h in S do; 5: r ← r + P(x|h); 6: end for; 7: f̂_S(x) ← r/|S|; 8: LL ← LL + log f̂_S(x); 9: end for; 10: return LL/|X|. (A runnable sketch of this estimator is given after the table.) |
| Open Source Code | No | The paper does not provide any specific statement about making the source code for their proposed methodology available, nor does it include any repository links. |
| Open Datasets | Yes | All models in these experiments were trained on the binarized MNIST data (thresholding at 0.5). |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) needed to reproduce the data partitioning for their own experiments. It only mentions a “validation set” in the context of tuning a previous estimator, not for their own CSL method. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. It only generally thanks 'Compute Canada, and Calcul Québec for funding', which are computing consortia, not specific hardware. |
| Software Dependencies | No | The paper mentions 'Theano (Bergstra et al., 2010; Bastien et al., 2012)' but does not provide specific version numbers for Theano or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | For more details on the architecture and training procedure of each model, see Appendix A. Appendix A provides details such as 'Noise: (input) 0.28 salt-and-pepper, (hidden) no noise; Learning: 9-step walkback (Bengio, 2013), learning rate 0.05, cross-entropy cost, 200 epochs' for GSN-1. (These settings are collected into a hypothetical configuration sketch after the table.) |
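
The CSL estimator in Algorithm 1 averages P(x|h) over latent samples h collected from a Markov chain and then takes the log. Below is a minimal sketch in Python/NumPy under that reading of the algorithm; the function names, the factorized-Bernoulli helper, and the use of log-sum-exp for numerical stability are illustrative choices, not code from the paper.

```python
import numpy as np
from scipy.special import logsumexp

def csl_log_likelihood(X, H, log_p_x_given_h):
    """CSL estimate: mean over test points x of log( (1/|S|) * sum_{h in S} P(x|h) ).

    X               : (n_test, d) array of test examples.
    H               : (n_samples, k) array of latent samples from a Markov chain
                      (the paper keeps every 100th state of the chain).
    log_p_x_given_h : callable (x, H) -> (n_samples,) array of log P(x|h),
                      specific to the model (DAE, GSN, RBM, DBM, DBN, ...).
    """
    total = 0.0
    for x in X:
        # log of the average of P(x|h) over S, computed with log-sum-exp for stability
        total += logsumexp(log_p_x_given_h(x, H)) - np.log(len(H))
    return total / len(X)

def bernoulli_log_p(decode):
    """Hypothetical factorized Bernoulli conditional: decode(H) returns per-pixel means."""
    def log_p(x, H):
        p = np.clip(decode(H), 1e-7, 1.0 - 1e-7)  # (n_samples, d) means in (0, 1)
        return (x * np.log(p) + (1.0 - x) * np.log(1.0 - p)).sum(axis=1)
    return log_p

# Illustrative usage (names hypothetical):
# X_test = (mnist_test > 0.5).astype(np.float64)   # binarize MNIST at 0.5, as in the paper
# H = chain_states[::100]                          # keep every 100th sample of the chain
# print(csl_log_likelihood(X_test, H, bernoulli_log_p(model_decode)))
```

Working in log space via log-sum-exp avoids underflow when P(x|h) is tiny for high-dimensional x; mathematically it is identical to the average-then-log of Algorithm 1.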
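
The Appendix A details quoted in the Experiment Setup row can be read as a training configuration. The dictionary below is a hypothetical illustration of the GSN-1 settings assembled from that quote; the key names are not from the authors' code.

```python
# Hypothetical GSN-1 training configuration built from the Appendix A quote.
gsn1_config = {
    "input_noise": {"type": "salt_and_pepper", "prob": 0.28},
    "hidden_noise": None,                 # no noise on the hidden layer
    "walkback_steps": 9,                  # walkback training (Bengio, 2013)
    "learning_rate": 0.05,
    "cost": "cross_entropy",
    "epochs": 200,
    "dataset": "binarized MNIST (threshold 0.5)",
}
```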