Bottleneck Conditional Density Estimation
Authors: Rui Shu, Hung H. Bui, Mohammad Ghavamzadeh
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our hybrid training procedure enables models to achieve competitive results in the MNIST quadrant prediction task in the fully-supervised setting, and sets new benchmarks in the semi-supervised regime for MNIST, SVHN, and CelebA. |
| Researcher Affiliation | Collaboration | Rui Shu (Stanford University); co-authors affiliated with Adobe Research and DeepMind (the work was done when all the authors were with Adobe Research). Correspondence to: Rui Shu <ruishu@stanford.edu>. |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | github.com/ruishu/bcde |
| Open Datasets | Yes | We evaluated the performance of our hybrid training procedure on the permutation-invariant quadrant prediction task (Sohn et al., 2014; Sohn et al., 2015) for MNIST, SVHN, and CelebA. |
| Dataset Splits | Yes | In the fully-supervised case, the original MNIST training set {x′_i}_{i=1}^{50000} is converted into our CDE training set {X_l, Y_l} = {x_i, y_i}_{i=1}^{50000} by splitting each image into its observed x and unobserved y regions according to the quadrant prediction task (see the first sketch below the table). In all cases, we extracted a validation set of 10000 samples for hyperparameter tuning. |
| Hardware Specification | No | No specific hardware details (GPU models, CPU types, memory amounts) were mentioned for running the experiments. The paper only states that the models were implemented using TensorFlow and mentions the types of neural networks used. |
| Software Dependencies | No | All models were implemented in Python 3 using TensorFlow (Abadi, 2015). No specific version numbers for TensorFlow or other libraries are given. |
| Experiment Setup | Yes | All neural networks are batch-normalized (Ioffe & Szegedy, 2015) and updated with Adam (Kingma & Ba, 2014). The number of training epochs is determined based on the validation set. The dimensionality of each stochastic layer is 50, 100, and 300 for MNIST, CelebA, and SVHN respectively. We tuned the regularization hyperparameter λ ∈ {10⁻³, 10⁻², …, 10³} on the MNIST 2-quadrant semi-supervised tasks and settled on using λ = 10⁻² for all tasks (see the second sketch below). |
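
To make the dataset-split row concrete, here is a minimal sketch of how a 28×28 MNIST image can be divided into an observed region x and an unobserved region y for the quadrant prediction task. This is not the authors' released code (github.com/ruishu/bcde); the `quadrant_split` helper and the quadrant ordering are illustrative assumptions.

```python
import numpy as np

def quadrant_split(image, n_observed_quadrants=2):
    """Hypothetical helper: split a flattened 28x28 MNIST image into an
    observed region x and an unobserved region y for the quadrant
    prediction task. The quadrant ordering is an assumption."""
    img = np.asarray(image).reshape(28, 28)
    # Quadrants: top-left, top-right, bottom-left, bottom-right.
    quads = [img[:14, :14], img[:14, 14:], img[14:, :14], img[14:, 14:]]
    observed = np.concatenate([q.ravel() for q in quads[:n_observed_quadrants]])
    unobserved = np.concatenate([q.ravel() for q in quads[n_observed_quadrants:]])
    return observed, unobserved  # x (conditioning input), y (target of the CDE)

# 2-quadrant task: x holds 392 observed pixels, y the remaining 392 to predict.
x, y = quadrant_split(np.zeros(784), n_observed_quadrants=2)
assert x.shape == (392,) and y.shape == (392,)
```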
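
The experiment-setup row can likewise be read as a concrete configuration. Below is a hedged TensorFlow/Keras sketch, not the paper's implementation: the hidden width and the Keras-style API are assumptions, while the 50-dimensional stochastic layer for MNIST, batch normalization, Adam, and λ = 10⁻² follow the values quoted above.

```python
import tensorflow as tf

LATENT_DIM = 50    # stochastic layer dimensionality reported for MNIST
LAMBDA_REG = 1e-2  # regularization hyperparameter lambda selected by the authors

def make_encoder(input_dim=392, hidden=512):
    """Batch-normalized MLP emitting the mean and log-variance of a
    50-dimensional Gaussian stochastic layer (hidden width is an assumption)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(hidden),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Dense(2 * LATENT_DIM),  # concatenated [mu, log_var]
    ])

encoder = make_encoder()
optimizer = tf.keras.optimizers.Adam()  # Adam, as reported; default settings assumed
```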