Zero-bias autoencoders and the benefits of co-adapting features
Authors: Kishore Konda, Roland Memisevic, and David Krueger
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work is motivated by the empirical observation that across a wide range of applications, hidden biases, b_k, tend to take on large negative values when training an autoencoder with one of the mentioned regularization schemes. In Figure 1 we confirm this finding, and we show that it is still true when features represent whole CIFAR-10 images (rather than a bag of features). The figure shows the classification performance of a standard contractive autoencoder with sigmoid hidden units trained on the permutation-invariant CIFAR-10 training dataset (i.e., using whole images rather than patches for training), using a linear classifier applied to the hidden activations. A minimal sketch of the zero-bias activations appears after this table. |
| Researcher Affiliation | Academia | Kishore Konda, Goethe University Frankfurt, Germany (konda.kishorereddy@gmail.com); Roland Memisevic, University of Montreal, Canada (roland.memisevic@umontreal.ca); David Krueger, University of Montreal, Canada (david.krueger@umontreal.ca) |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not include any clearly labeled pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | An example implementation of the zero-bias autoencoder in Python is available at http://www.iro.umontreal.ca/~memisevr/code/zae/. |
| Open Datasets | Yes | We chose the CIFAR-10 dataset (Krizhevsky & Hinton, 2009). It contains color images of size 32 × 32 pixels that are assigned to 10 different classes. The number of samples for training is 50,000 and for testing is 10,000. We used the recognition pipeline proposed in Le et al. (2011); Konda et al. (2014) and evaluated it on the Hollywood2 dataset (Marszałek et al., 2009). |
| Dataset Splits | Yes | The number of samples for training is 50,000 and for testing is 10,000. We classify the resulting representation using logistic regression with weight decay for classification, with the weight cost parameter estimated using cross-validation on a subset of the training samples of size 10,000. A sketch of this evaluation protocol appears after this table. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions that an implementation is available in Python but does not specify version numbers for Python or any other software libraries or frameworks used in the experiments. |
| Experiment Setup | Yes | For all the experiments in this section we chose a learning rate of 0.0001 for a few (e.g. 3) initial training epochs, and then increased it to 0.001. This is to ensure that scaling issues in the initialization are dealt with at the outset, and to help avoid any blow-ups during training. Each model is trained for 1000 epochs in total with a fixed momentum of 0.9. The threshold parameter θ is fixed to 1.0 for both the TRec and TLin autoencoders. A training-loop sketch of this schedule appears after this table. |
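
To make the quoted zero-bias setup concrete, here is a minimal NumPy sketch of a tied-weight, bias-free autoencoder with the thresholded-rectified (TRec) and thresholded-linear (TLin) activations the paper discusses. The function names, array shapes, and the gradient derivation are illustrative assumptions, not taken from the authors' released code.

```python
import numpy as np

def hidden(X, W, theta=1.0, kind="TRec"):
    """Zero-bias hidden activations: no additive bias, only a threshold.
    Assumed forms: TRec: a * (a > theta); TLin: a * (|a| > theta), with a = X W^T."""
    A = X @ W.T
    if kind == "TRec":
        return A * (A > theta)
    if kind == "TLin":
        return A * (np.abs(A) > theta)
    raise ValueError(f"unknown activation kind: {kind}")

def reconstruct(H, W):
    """Tied-weight linear decoder, also without a bias term."""
    return H @ W

def loss_and_grad(X, W, theta=1.0, kind="TRec"):
    """Mean squared reconstruction error and its gradient w.r.t. W.
    The threshold gate is treated as constant on the active set, which is
    exact for these piecewise-linear activations away from the threshold."""
    N = X.shape[0]
    H = hidden(X, W, theta, kind)
    err = reconstruct(H, W) - X
    loss = 0.5 * np.mean(np.sum(err ** 2, axis=1))
    grad_dec = H.T @ err / N                         # decoder path, shape (K, D)
    grad_enc = ((err @ W.T) * (H != 0)).T @ X / N    # encoder path, shape (K, D)
    return loss, grad_dec + grad_enc
```

Usage would look like `loss, G = loss_and_grad(X_batch, W)` with `W` initialized to small random values, e.g. `0.01 * np.random.randn(K, D)` (an illustrative choice, not the paper's initialization).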
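The optimisation details quoted in the Experiment Setup row translate into a plain SGD-with-momentum loop. The sketch below encodes the stated schedule (learning rate 0.0001 for the first 3 epochs, 0.001 afterwards, 1000 epochs, momentum 0.9, θ = 1.0); the minibatch size and the `grad_fn` interface are assumptions for illustration.

```python
import numpy as np

def train(W, X, grad_fn, theta=1.0, epochs=1000, warmup_epochs=3,
          lr_warmup=1e-4, lr=1e-3, momentum=0.9, batch_size=100, seed=0):
    """SGD with momentum following the schedule quoted above.
    grad_fn(batch, W, theta) is assumed to return (loss, gradient)."""
    rng = np.random.default_rng(seed)
    V = np.zeros_like(W)                                   # momentum buffer
    for epoch in range(epochs):
        step = lr_warmup if epoch < warmup_epochs else lr  # small LR for first few epochs
        perm = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = X[perm[start:start + batch_size]]
            _, G = grad_fn(batch, W, theta)
            V = momentum * V - step * G                    # classical momentum update
            W = W + V
    return W
```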
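The Dataset Splits row describes classifying the learned representation with logistic regression plus weight decay, the weight-cost parameter chosen by cross-validation on a 10,000-sample subset of the training set. Below is a rough sketch of that protocol using scikit-learn as a stand-in; the grid of regularization strengths and the 3-fold cross-validation are assumptions, not details from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

def evaluate(H_train, y_train, H_test, y_test, subset_size=10_000, seed=0):
    """Select the weight-decay strength on a training subset, then retrain on
    the full training set and report test accuracy (CIFAR-10: 50,000 / 10,000)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(H_train), size=subset_size, replace=False)

    # Cross-validate the inverse weight cost C on the 10,000-sample subset.
    search = GridSearchCV(
        LogisticRegression(max_iter=1000),
        param_grid={"C": [1e-3, 1e-2, 1e-1, 1.0, 10.0]},
        cv=3,
    )
    search.fit(H_train[idx], y_train[idx])

    # Retrain with the selected setting on all training features.
    clf = LogisticRegression(max_iter=1000, C=search.best_params_["C"])
    clf.fit(H_train, y_train)
    return clf.score(H_test, y_test)
```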