Bottleneck Conditional Density Estimation

Authors: Rui Shu, Hung H. Bui, Mohammad Ghavamzadeh

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that our hybrid training procedure enables models to achieve competitive results in the MNIST quadrant prediction task in the fully-supervised setting, and sets new benchmarks in the semi-supervised regime for MNIST, SVHN, and CelebA."
Researcher Affiliation | Collaboration | Affiliations: 1 Stanford University, 2 Adobe Research, 3 DeepMind (the work was done when all the authors were with Adobe Research). Correspondence to: Rui Shu <ruishu@stanford.edu>.
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | github.com/ruishu/bcde
Open Datasets | Yes | "We evaluated the performance of our hybrid training procedure on the permutation-invariant quadrant prediction task (Sohn et al., 2014; Sohn et al., 2015) for MNIST, SVHN, and CelebA."
Dataset Splits | Yes | "In the fully-supervised case, the original MNIST training set {x_i} is converted into our CDE training set {X_l, Y_l} = {(x_i, y_i)}_{i=1}^{50000} by splitting each image into its observed x and unobserved y regions according to the quadrant prediction task. In all cases, we extracted a validation set of 10000 samples for hyperparameter tuning."
Hardware Specification | No | No specific hardware details (GPU models, CPU types, memory amounts) are mentioned for running the experiments; the paper only states that the models were implemented in TensorFlow and names the types of neural networks used.
Software Dependencies | No | "All models were implemented in Python3 using Tensorflow (Abadi, 2015)." No specific version numbers for TensorFlow or other libraries are given.
Experiment Setup | Yes | "All neural networks are batch-normalized (Ioffe & Szegedy, 2015) and updated with Adam (Kingma & Ba, 2014). The number of training epochs is determined based on the validation set. The dimensionality of each stochastic layer is 50, 100, and 300 for MNIST, CelebA, and SVHN respectively. We tuned the regularization hyperparameter λ ∈ {10^-3, 10^-2, ..., 10^3} on the MNIST 2-quadrant semi-supervised tasks and settled on using λ = 10^-2 for all tasks."
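The quadrant-split construction and the reported hyperparameters quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes 28x28 MNIST-style arrays, the choice of which quadrants count as "observed" is hypothetical, and only the latent dimensionalities and the λ grid are taken from the quoted setup.

```python
import numpy as np

# Per-dataset stochastic-layer dimensionality reported in the quoted setup.
LATENT_DIM = {"mnist": 50, "celeba": 100, "svhn": 300}

# Regularization grid lambda in {10^-3, ..., 10^3}; the authors settled on 10^-2.
LAMBDA_GRID = [1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3]
CHOSEN_LAMBDA = 1e-2

def quadrant_split(image, n_observed=1):
    """Split one image into observed pixels x and unobserved pixels y.

    With n_observed=1 the top-left quadrant is observed and the remaining
    three quadrants form the prediction target; which quadrants count as
    observed is an illustrative choice, not necessarily the paper's.
    """
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[: h // 2, : w // 2] = True        # 1st observed quadrant (top-left)
    if n_observed >= 2:
        mask[h // 2 :, : w // 2] = True    # 2nd observed quadrant (bottom-left)
    if n_observed >= 3:
        mask[h // 2 :, w // 2 :] = True    # 3rd observed quadrant (bottom-right)
    return image[mask], image[~mask]       # (x, y), both flattened

# Example: a 2-quadrant task on a dummy 28x28 image.
image = np.random.rand(28, 28)
x, y = quadrant_split(image, n_observed=2)
# x holds 2 * 14 * 14 = 392 observed pixels; y holds the 392 to be predicted.
```

Applied to the full training set, this mapping turns each unlabeled image x_i into a conditional pair (x_i, y_i), which is how the quoted "Dataset Splits" row describes constructing {X_l, Y_l}.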