Conditional Generative Moment-Matching Networks
Authors: Yong Ren, Jun Zhu, Jialian Li, Yucen Luo
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate CGMMN on a wide range of tasks, including predictive modeling, contextual generation, and Bayesian dark knowledge, which distills knowledge from a Bayesian model by learning a relatively small CGMMN student network. Our results demonstrate competitive performance in all the tasks. |
| Researcher Affiliation | Academia | Yong Ren, Jialian Li, Yucen Luo, Jun Zhu. Dept. of Comp. Sci. & Tech., TNList Lab; Center for Bio-Inspired Computing Research, State Key Lab for Intell. Tech. & Systems, Tsinghua University, Beijing, China. {renyong15, luoyc15, jl12}@mails.tsinghua.edu.cn; dcszj@tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1: Stochastic gradient descent for CGMMN (a hedged sketch of this training step appears below the table) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We first present the prediction performance on the widely used MNIST dataset... We then report the prediction performance on the Street View House Numbers (SVHN) dataset... We now show the generation results on the Extended Yale Face dataset... Due to space limitations, we test our model on a regression problem on the Boston housing dataset. |
| Dataset Splits | Yes | The whole dataset is divided into 3 parts with 50,000 training examples, 10,000 validation examples and 10,000 testing examples. (A hypothetical reconstruction of this split appears below the table.) |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments, such as specific CPU or GPU models. |
| Software Dependencies | No | The paper mentions software like "Adam [13]" and activation functions like "ReLU" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | In the MLP case, the model architecture is shown in Fig. 1 with a uniform distribution for hidden variables of dimension 5... The MLP has 3 hidden layers with hidden unit numbers (500, 200, 100) and the ReLU activation function. A minibatch size of 500 is adopted. In the CNN case, we use the same architecture as [18], where there are 32 feature maps in the first two convolutional layers and 64 feature maps in the last three hidden layers. An MLP of 500 hidden units is adopted at the end of the convolutional layers. The ReLU activation function is used in the convolutional layers and the sigmoid function in the last layer. We do not pre-train our model, and a minibatch size of 500 is adopted as well. In both settings, we use Adam [13] to optimize parameters. (A hedged sketch of the MLP configuration appears below the table.) |
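For concreteness, here is a minimal PyTorch sketch of the training step that Algorithm 1 describes, assuming the shared-minibatch form of the conditional maximum mean discrepancy (CMMD) objective, in which real and generated outputs are conditioned on the same minibatch of inputs. The Gaussian kernel, the bandwidth `sigma`, the regularizer `lam`, and the `generator` interface are illustrative assumptions rather than the paper's exact choices.

```python
import torch

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise RBF kernel matrix between the rows of a and b.
    return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))

def cmmd_loss(x, y_data, y_gen, lam=1e-3, sigma=1.0):
    # Empirical squared CMMD when real and generated outputs share the same
    # conditioning minibatch x; lam regularizes the kernel matrix inverse.
    n = x.shape[0]
    K = gaussian_kernel(x, x, sigma)              # kernel on conditioning inputs
    Kinv = torch.inverse(K + lam * torch.eye(n))  # (K + lam * I)^{-1}
    A = Kinv @ K @ Kinv
    L_d = gaussian_kernel(y_data, y_data, sigma)  # real vs. real
    L_s = gaussian_kernel(y_gen, y_gen, sigma)    # generated vs. generated
    L_ds = gaussian_kernel(y_data, y_gen, sigma)  # real vs. generated
    return torch.trace(A @ (L_d + L_s - L_ds - L_ds.T))

def sgd_step(generator, optimizer, x, y, noise_dim=5):
    # One step in the spirit of Algorithm 1: draw uniform hidden variables,
    # generate outputs conditioned on x, and descend the CMMD gradient.
    h = torch.rand(x.shape[0], noise_dim)
    y_gen = generator(torch.cat([x, h], dim=1))
    loss = cmmd_loss(x, y, y_gen)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```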
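The 50,000/10,000/10,000 figures quoted above correspond to the usual MNIST convention of holding a validation set out of the 60,000 training images; a hypothetical torchvision reconstruction follows (the data root and the fixed seed are assumptions):

```python
import torch
from torchvision import datasets

# Reconstruct the quoted 50,000 / 10,000 / 10,000 MNIST split by holding
# out 10,000 of the 60,000 training images for validation.
full_train = datasets.MNIST("data", train=True, download=True)
test_set = datasets.MNIST("data", train=False, download=True)  # 10,000 test examples
train_set, val_set = torch.utils.data.random_split(
    full_train, [50_000, 10_000],
    generator=torch.Generator().manual_seed(0),  # fixed seed is an assumption
)
```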
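Finally, a hedged sketch of the quoted MLP experiment setup: 5-dimensional uniform hidden variables concatenated with the condition, hidden layers of (500, 200, 100) with ReLU, a minibatch size of 500, and Adam as the optimizer. The condition and output dimensions, the learning rate, and the final sigmoid are assumptions (the paper specifies a sigmoid last layer only for the CNN variant).

```python
import torch
import torch.nn as nn

class CGMMNGeneratorMLP(nn.Module):
    # Hypothetical generator matching the quoted setup: the conditioning
    # input is concatenated with 5-dim uniform noise before the ReLU layers.
    def __init__(self, cond_dim, out_dim, noise_dim=5):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(cond_dim + noise_dim, 500), nn.ReLU(),
            nn.Linear(500, 200), nn.ReLU(),
            nn.Linear(200, 100), nn.ReLU(),
            nn.Linear(100, out_dim), nn.Sigmoid(),  # sigmoid output is an assumption
        )

    def forward(self, xh):
        return self.net(xh)

generator = CGMMNGeneratorMLP(cond_dim=10, out_dim=784)        # e.g. one-hot label -> image
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)  # lr is an assumption; the paper uses Adam [13]

x = torch.zeros(500, 10)                       # a minibatch of 500 conditions
h = torch.rand(500, 5)                         # uniform hidden variables of dimension 5
samples = generator(torch.cat([x, h], dim=1))  # -> (500, 784)
```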