Conditional Generative Moment-Matching Networks

Authors: Yong Ren, Jun Zhu, Jialian Li, Yucen Luo

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate CGMMN on a wide range of tasks, including predictive modeling, contextual generation, and Bayesian dark knowledge, which distills knowledge from a Bayesian model by learning a relatively small CGMMN student network. Our results demonstrate competitive performance in all the tasks.
Researcher Affiliation | Academia | Yong Ren, Jialian Li, Yucen Luo, Jun Zhu; Dept. of Comp. Sci. & Tech., TNList Lab; Center for Bio-Inspired Computing Research; State Key Lab for Intell. Tech. & Systems, Tsinghua University, Beijing, China; {renyong15, luoyc15, jl12}@mails.tsinghua.edu.cn; dcszj@tsinghua.edu.cn
Pseudocode | Yes | Algorithm 1: Stochastic gradient descent for CGMMN (a hedged sketch of this training loop is given below the table).
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We first present the prediction performance on the widely used MNIST dataset... We then report the prediction performance on the Street View House Numbers (SVHN) dataset... We now show the generating results on the Extended Yale Face dataset... Due to the space limitation, we test our model on a regression problem on the Boston housing dataset.
Dataset Splits | Yes | The whole dataset is divided into 3 parts with 50,000 training examples, 10,000 validation examples and 10,000 testing examples.
Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments, such as specific CPU or GPU models.
Software Dependencies | No | The paper mentions software like "Adam [13]" and activation functions like "ReLU" but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | In the MLP case, the model architecture is shown in Fig. 1 with a uniform distribution for hidden variables of dimension 5... The MLP has 3 hidden layers with hidden unit numbers (500, 200, 100) and the ReLU activation function. A minibatch size of 500 is adopted. In the CNN case, we use the same architecture as [18], where there are 32 feature maps in the first two convolutional layers and 64 feature maps in the last three hidden layers. An MLP of 500 hidden units is adopted at the end of the convolutional layers. The ReLU activation function is used in the convolutional layers and the sigmoid function in the last layer. We do not pre-train our model and a minibatch size of 500 is adopted as well. In both settings, we use Adam [13] to optimize parameters. (A sketch of the described MLP configuration follows the training-loop sketch below.)
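
The pseudocode referenced above is Algorithm 1, stochastic gradient descent on the conditional maximum mean discrepancy (CMMD) between the data conditional p(y|x) and the generator's conditional. Below is a minimal sketch of that loop, not the authors' released code: the RBF kernel, the bandwidth `sigma`, the regularizer `lam`, and the use of PyTorch are illustrative assumptions, and the estimator is written for the common case where observed and generated samples share the same conditioning inputs, as in minibatch training.

```python
import torch


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq_dist = torch.cdist(a, b) ** 2
    return torch.exp(-sq_dist / (2.0 * sigma ** 2))


def cmmd2(x, y_data, y_gen, lam=1e-3, sigma=1.0):
    """Squared conditional MMD between the data conditional p(y|x) and the
    generator's conditional, via the kernel-embedding estimator.  Both sample
    sets are assumed to share the same conditioning inputs x."""
    n = x.shape[0]
    K = rbf_kernel(x, x, sigma)                        # kernel on conditioning inputs
    K_inv = torch.linalg.inv(K + lam * torch.eye(n))   # (K + lambda * I)^{-1}
    L_d = rbf_kernel(y_data, y_data, sigma)            # kernel on observed outputs
    L_s = rbf_kernel(y_gen, y_gen, sigma)              # kernel on generated outputs
    L_sd = rbf_kernel(y_gen, y_data, sigma)            # cross kernel (generated vs. observed)
    A = K @ K_inv
    return (torch.trace(A @ L_d @ K_inv)
            + torch.trace(A @ L_s @ K_inv)
            - 2.0 * torch.trace(A @ L_sd @ K_inv))


def train_step(generator, optimizer, x, y, noise_dim=5):
    """One step of the Algorithm-1-style loop: draw uniform hidden variables,
    generate conditional samples, and take a gradient step on the CMMD."""
    h = torch.rand(x.shape[0], noise_dim)              # uniform hidden variables, as in the paper
    y_gen = generator(torch.cat([x, h], dim=1))        # samples from the model's p(y|x)
    loss = cmmd2(x, y, y_gen)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```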
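
For the experiment-setup row, the following is a sketch of the described MLP variant under stated assumptions: a 784-dimensional MNIST input concatenated with the 5-dimensional uniform hidden variables, three ReLU hidden layers of 500, 200 and 100 units, and a 10-way output. The softmax output nonlinearity and the Adam learning rate are assumptions not given in the quoted setup.

```python
import torch
import torch.nn as nn

x_dim, noise_dim, n_classes = 784, 5, 10   # MNIST input size, hidden-variable dim from the paper, 10 classes

# Hidden layers (500, 200, 100) with ReLU, as described; softmax output is an assumption.
mlp_generator = nn.Sequential(
    nn.Linear(x_dim + noise_dim, 500), nn.ReLU(),
    nn.Linear(500, 200), nn.ReLU(),
    nn.Linear(200, 100), nn.ReLU(),
    nn.Linear(100, n_classes), nn.Softmax(dim=1),
)

# Adam, as stated in the setup; the learning rate is an assumption.
optimizer = torch.optim.Adam(mlp_generator.parameters(), lr=1e-3)
```

Training would then iterate `train_step(mlp_generator, optimizer, x_batch, y_batch)` from the sketch above over minibatches of 500 examples.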