Learning with Pseudo-Ensembles
Authors: Philip Bachman, Ouais Alsharif, Doina Precup
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested PEA regularization in three scenarios: supervised learning on MNIST digits, semi-supervised learning on MNIST digits, and semi-supervised transfer learning on a dataset from the NIPS 2011 Workshop on Challenges in Learning Hierarchical Models [13]. Full implementations of our methods, written with THEANO [3], and scripts/instructions for reproducing all of the results in this section are available online at: http://github.com/Philip-Bachman/Pseudo-Ensembles. |
| Researcher Affiliation | Academia | Philip Bachman, McGill University, Montreal, QC, Canada, phil.bachman@gmail.com; Ouais Alsharif, McGill University, Montreal, QC, Canada, ouais.alsharif@gmail.com; Doina Precup, McGill University, Montreal, QC, Canada, dprecup@cs.mcgill.ca |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Full implementations of our methods, written with THEANO [3], and scripts/instructions for reproducing all of the results in this section are available online at: http://github.com/Philip-Bachman/Pseudo-Ensembles. All code required for these experiments is publicly available online. |
| Open Datasets | Yes | The MNIST dataset comprises 60k 28x28 grayscale hand-written digit images for training and 10k images for testing. The labeled data source was CIFAR-100 [11], which contains 50k 32x32 color images in 100 classes. The unlabeled data source was a collection of 100k 32x32 color images taken from Tiny Images [11]. We now show how the Recursive Neural Tensor Network (RNTN) from [19] can be adapted using pseudo-ensembles, and evaluate it on the Stanford Sentiment Treebank (STB) task. |
| Dataset Splits | No | The paper describes splitting the training samples into labeled/unlabeled subsets and evaluating on a separate test set, but it does not explicitly describe a distinct validation split (e.g., its size, percentage, or construction method) for model tuning or early stopping. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU or CPU models. It only mentions that the methods are written with THEANO. |
| Software Dependencies | No | The paper mentions "THEANO [3]" but does not specify a version number for THEANO or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | For the supervised tests we used SGD hyperparameters roughly following those in [9]. We trained networks with two hidden layers of 800 nodes each, using rectified-linear activations and an ℓ2-norm constraint of 3.5 on incoming weights for each node. We initialized hidden layer biases to 0.1, output layer biases to 0, and inter-layer weights to zero-mean Gaussian noise with σ = 0.01. We trained all networks for 1000 epochs with no early-stopping. |
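The experiment-setup row above can be made concrete with a minimal NumPy sketch of the described network: two hidden layers of 800 rectified-linear units, Gaussian weight init with σ = 0.01, hidden biases of 0.1, output biases of 0, and an ℓ2 max-norm constraint of 3.5 on each node's incoming weights. This is an illustrative reconstruction, not the authors' Theano code; the helper names (`init_layer`, `max_norm`, `forward`) are my own.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out, bias=0.1, sigma=0.01):
    """Initialize as in the paper: zero-mean Gaussian weights (sigma=0.01),
    biases 0.1 for hidden layers (0 for the output layer)."""
    W = rng.normal(0.0, sigma, size=(n_in, n_out))
    b = np.full(n_out, bias)
    return W, b

def max_norm(W, c=3.5):
    """Rescale each node's incoming weight vector (a column of W)
    so that its L2 norm does not exceed c. Applied after each update."""
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    return W * np.minimum(1.0, c / np.maximum(norms, 1e-12))

def forward(x, params):
    """784 -> 800 -> 800 -> 10 forward pass with rectified-linear units."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = np.maximum(0.0, x @ W1 + b1)
    h2 = np.maximum(0.0, h1 @ W2 + b2)
    return h2 @ W3 + b3  # logits over 10 digit classes

# MNIST-shaped network as described in the experiment-setup row.
params = [init_layer(784, 800),
          init_layer(800, 800),
          init_layer(800, 10, bias=0.0)]
```

The max-norm constraint would be re-applied to each weight matrix after every SGD step; training for 1000 epochs with no early stopping, per the quoted setup, is a separate loop not shown here.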