Generative Moment Matching Networks
Authors: Yujia Li, Kevin Swersky, Rich Zemel
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that this relatively simple, yet very flexible framework is effective at producing good generative models in an efficient manner. On MNIST and the Toronto Face Dataset (TFD) we demonstrate improved results over comparable baselines, including GANs. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of Toronto, Toronto, ON, Canada; (2) Canadian Institute for Advanced Research, Toronto, ON, Canada |
| Pseudocode | Yes | Algorithm 1: GMMN minibatch training |
| Open Source Code | Yes | Source code for training GMMNs is available at https://github.com/yujiali/gmmn. |
| Open Datasets | Yes | We trained GMMNs on two benchmark datasets: MNIST (LeCun et al., 1998) and the Toronto Face Dataset (TFD) (Susskind et al., 2010). |
| Dataset Splits | Yes | For MNIST, we used the standard test set of 10,000 images, and split out 5000 from the standard 60,000 training images for validation. The remaining 55,000 were used for training. |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments (e.g., CPU, GPU, memory, or cloud instances). |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | For all experiments in this section the GMMN networks were trained with minibatches of size 1000; for each minibatch we generated a set of 1000 samples from the network. For all the networks we used in this section, a uniform distribution in [-1, 1]^H was used as the prior for the H-dimensional stochastic hidden layer at the top of the GMMN, which was followed by 4 ReLU layers, and the output was a layer of logistic sigmoid units. Cross entropy was used as the reconstruction loss. Dropout (Hinton et al., 2012b) was used on the encoder layers. (A minimal sketch of this setup follows the table.) |
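
The experiment-setup row describes a fully connected generator: an H-dimensional stochastic hidden layer drawn uniformly from [-1, 1]^H, four ReLU layers, and a logistic sigmoid output layer. The sketch below illustrates that structure; it assumes PyTorch, and the layer widths, hidden dimension, and output dimension are placeholder values rather than figures reported in the paper.

```python
# Illustrative GMMN generator; framework (PyTorch) and sizes are assumptions.
import torch
import torch.nn as nn

class GMMNGenerator(nn.Module):
    """Uniform noise in [-1, 1]^H -> 4 ReLU layers -> logistic sigmoid outputs."""
    def __init__(self, h_dim=10, hidden=64, out_dim=784):
        super().__init__()
        self.h_dim = h_dim
        layers, prev = [], h_dim
        for _ in range(4):                          # four ReLU layers, as described
            layers += [nn.Linear(prev, hidden), nn.ReLU()]
            prev = hidden
        layers += [nn.Linear(prev, out_dim), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, n):
        # Sample the H-dimensional stochastic hidden layer from U[-1, 1]^H.
        h = torch.rand(n, self.h_dim) * 2 - 1
        return self.net(h)
```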
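Continuing from the generator above, the next sketch gives a minimal version of the minibatch training loop referenced as Algorithm 1: each step takes a minibatch of 1000 data points, draws 1000 samples from the generator, and minimizes the squared MMD between the two sets. The Gaussian kernel bandwidths, the optimizer and learning rate, and the random stand-in for the 55,000 MNIST training images are assumptions for illustration, not values from the paper.

```python
from torch.utils.data import DataLoader, TensorDataset

def mmd2(x, y, sigmas=(1.0, 5.0, 10.0)):
    """Squared MMD between sample sets x and y under a mixture of Gaussian kernels."""
    z = torch.cat([x, y], dim=0)
    d2 = torch.cdist(z, z).pow(2)                    # pairwise squared distances
    k = sum(torch.exp(-d2 / (2.0 * s ** 2)) for s in sigmas)
    n = x.size(0)
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()

# One pass over a stand-in dataset; random noise replaces the 55,000 flattened
# MNIST training images (hypothetical data, for illustration only).
data = torch.rand(55_000, 784)
loader = DataLoader(TensorDataset(data), batch_size=1000, shuffle=True)
gen = GMMNGenerator()                                # generator from the sketch above
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)    # optimizer and lr are assumed
for (x_batch,) in loader:
    loss = mmd2(gen(x_batch.size(0)), x_batch)       # 1000 samples vs. 1000 data points
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the paper, the cross-entropy reconstruction loss and encoder dropout mentioned in the setup row pertain to the autoencoder used in the GMMN+AE variant, where the same MMD objective is applied in the autoencoder's code space rather than directly in pixel space.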