Temporal Ensembling for Semi-Supervised Learning

Authors: Samuli Laine, Timo Aila

ICLR 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility (Variable — Result — LLM Response)
Research Type — Experimental. "Using our method, we set new records for two standard semi-supervised learning benchmarks, reducing the (non-augmented) classification error rate from 18.44% to 7.05% in SVHN with 500 labels and from 18.63% to 16.55% in CIFAR-10 with 4000 labels, and further to 5.12% and 12.16% by enabling the standard augmentations."
Researcher Affiliation — Industry. Samuli Laine, NVIDIA (slaine@nvidia.com); Timo Aila, NVIDIA (taila@nvidia.com).
Pseudocode — Yes. Algorithm 1: Π-model pseudocode.

    Require: x_i = training stimuli
    Require: L = set of training input indices with known labels
    Require: y_i = labels for labeled inputs i ∈ L
    Require: w(t) = unsupervised weight ramp-up function
    Require: f_θ(x) = stochastic neural network with trainable parameters θ
    Require: g(x) = stochastic input augmentation function
    for t in [1, num_epochs] do
        for each minibatch B do
            z_{i∈B}  ← f_θ(g(x_{i∈B}))    ▷ evaluate network outputs for augmented inputs
            z̃_{i∈B} ← f_θ(g(x_{i∈B}))    ▷ again, with different dropout and augmentation
            loss ← −(1/|B|) Σ_{i∈(B∩L)} log z_i[y_i]            ▷ supervised loss component
                   + w(t) · (1/(C|B|)) Σ_{i∈B} ||z_i − z̃_i||²   ▷ unsupervised loss component
            update θ using, e.g., ADAM    ▷ update network parameters
        end for
    end for
    return θ
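As a sanity check on the loss in Algorithm 1, here is a minimal NumPy sketch of the combined Π-model loss for one minibatch. It assumes the two stochastic forward passes have already produced softmax outputs z and z̃; the function and variable names are illustrative, not taken from the released code.

```python
import numpy as np

def pi_model_loss(z, z_tilde, labels, labeled_mask, w_t):
    """Π-model loss for one minibatch, following Algorithm 1.

    z, z_tilde   : (|B|, C) softmax outputs from two stochastic forward passes
    labels       : (|B|,) integer class labels (entries for unlabeled inputs ignored)
    labeled_mask : (|B|,) bool, True where the label is known (i in B ∩ L)
    w_t          : unsupervised weight w(t) at the current epoch
    """
    B, C = z.shape
    # Supervised component: cross-entropy over B ∩ L, normalized by |B|.
    idx = np.flatnonzero(labeled_mask)
    supervised = -np.sum(np.log(z[idx, labels[idx]])) / B
    # Unsupervised component: mean squared difference of the two predictions,
    # normalized by the number of classes C and the minibatch size.
    unsupervised = np.sum((z - z_tilde) ** 2) / (C * B)
    return supervised + w_t * unsupervised
```

Note that when the two passes agree exactly, the unsupervised term vanishes and only the cross-entropy over the labeled subset remains.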
Open Source Code — Yes. "Our implementation is written in Python using Theano (Theano Development Team, 2016) and Lasagne (Dieleman et al., 2015), and is available at https://github.com/smlaine2/tempens."
Open Datasets — Yes. "We test the Π-model and temporal ensembling in two image classification tasks, CIFAR-10 and SVHN, and report the mean and standard deviation of 10 runs using different random seeds." The paper additionally uses CIFAR-100: "The CIFAR-100 dataset consists of 32 × 32 pixel RGB images from a hundred classes."
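The temporal ensembling variant evaluated alongside the Π-model replaces the second network pass with an exponential moving average of past per-epoch predictions. A minimal sketch of that accumulation, using the paper's decay constant α = 0.6 and its Adam-style startup bias correction (names are illustrative):

```python
import numpy as np

def update_ensemble(Z, z, epoch, alpha=0.6):
    """Accumulate this epoch's predictions z into the ensemble output Z and
    return the bias-corrected training target z_tilde.

    Z     : running ensemble of predictions (same shape as z)
    z     : network predictions from the current epoch
    epoch : 0-based epoch index, used for the startup correction
    alpha : ensembling momentum (0.6 in the paper)
    """
    Z = alpha * Z + (1.0 - alpha) * z          # exponential moving average
    z_tilde = Z / (1.0 - alpha ** (epoch + 1)) # correct the zero-initialization bias
    return Z, z_tilde
```

With the bias correction, the target after the very first epoch equals that epoch's predictions exactly, rather than a down-scaled copy.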
Dataset Splits — No. The paper refers to "training stimuli" and "labeled inputs" for training and uses test sets for evaluation, but does not explicitly describe a separate validation split, its size, or how it is partitioned from the training data.
Hardware Specification — No. The paper details the software implementation but gives no hardware specifications, such as the GPU or CPU models used to run the experiments.
Software Dependencies — No. "Our implementation is written in Python using Theano (Theano Development Team, 2016) and Lasagne (Dieleman et al., 2015)." The software is named, but specific version numbers for Theano and Lasagne are not provided.
Experiment Setup — Yes. "All networks were trained using Adam (Kingma & Ba, 2014) with a maximum learning rate of λ_max = 0.003, except for temporal ensembling in the SVHN case where a maximum learning rate of λ_max = 0.001 worked better. Adam momentum parameters were set to β_1 = 0.9 and β_2 = 0.999 as suggested in the paper. The maximum value for the unsupervised loss component was scaled to w_max · M/N, where M is the number of labeled inputs and N is the total number of training inputs. For Π-model runs, we used w_max = 100 in all runs except for CIFAR-100 with Tiny Images where we set w_max = 300. For temporal ensembling we used w_max = 30 in most runs. All networks were trained for 300 epochs with minibatch size of 100."
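The weight schedule w(t) described above can be sketched as follows. The Gaussian ramp-up shape exp(−5(1 − T)²) over the first 80 epochs follows the paper's appendix, and the M/N scaling follows the quoted setup; the function names are illustrative.

```python
import math

def ramp_up(epoch, ramp_length=80):
    """Gaussian ramp-up curve exp(-5(1 - T)^2), with T = epoch / ramp_length
    clamped to [0, 1]; the curve reaches 1.0 once the ramp-up period ends."""
    T = min(epoch / ramp_length, 1.0)
    return math.exp(-5.0 * (1.0 - T) ** 2)

def unsupervised_weight(epoch, w_max, num_labeled, num_total):
    """w(t): ramp the unsupervised loss weight toward w_max * M/N."""
    return w_max * (num_labeled / num_total) * ramp_up(epoch)
```

For example, with CIFAR-10's 4000 labels out of 50000 training inputs and w_max = 100, the weight ramps from near zero toward its plateau of 100 · 4000/50000 = 8.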