Rectified Factor Networks

Authors: Djork-Arné Clevert, Andreas Mayr, Thomas Unterthiner, Sepp Hochreiter

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On benchmarks, RFNs are compared to other unsupervised methods like autoencoders, RBMs, factor analysis, ICA, and PCA. In contrast to previous sparse coding methods, RFNs yield sparser codes, capture the data's covariance structure more precisely, and have a significantly smaller reconstruction error. We test RFNs as a pretraining technique for deep networks on different vision datasets, where RFNs were superior to RBMs and autoencoders. On gene expression data from two pharmaceutical drug discovery studies, RFNs detected small and rare gene modules that revealed highly relevant new biological insights which were so far missed by other unsupervised methods.
Researcher Affiliation | Academia | Djork-Arné Clevert, Andreas Mayr, Thomas Unterthiner and Sepp Hochreiter, Institute of Bioinformatics, Johannes Kepler University, Linz, Austria; {okko,mayr,unterthiner,hochreit}@bioinf.jku.at
Pseudocode | Yes | Algorithm 1: Rectified Factor Network (an illustrative sketch of the algorithm's overall structure is given after this table).
Open Source Code | Yes | An RFN package for GPU/CPU is available at http://www.bioinf.jku.at/software/rfn.
Open Datasets | Yes | The benchmark datasets and results are taken from previous publications [25, 26, 27, 28] and comprise: (i) MNIST (the original MNIST), (ii) basic (a smaller subset of MNIST for training), (iii) bg-rand (MNIST with random noise background), (iv) bg-img (MNIST with random image background), (v) rect (tall or wide rectangles), (vi) rect-img (tall or wide rectangular images with random background images), (vii) convex (convex or concave shapes), (viii) CIFAR-10 (60k color images in 10 classes), and (ix) NORB (29,160 stereo image pairs of 5 categories). For each dataset, the sizes of the training, validation, and test sets are given in the second column of Tab. 2.
Dataset Splits | Yes | For each dataset, the sizes of the training, validation, and test sets are given in the second column of Tab. 2. As preprocessing, only median centering was performed. Model selection is based on the validation set [26]. The RFN hyperparameters are (i) the number of units per layer from {1024, 2048, 4096} and (ii) the dropout rate from {0.0, 0.25, 0.5, 0.75}. The learning rate was fixed to η = 0.01 (the default value). For supervised fine-tuning with stochastic gradient descent, the learning rate was selected from {0.1, 0.01, 0.001}, the masking noise from {0.0, 0.25}, and the number of layers from {1, 3}. Fine-tuning was stopped based on the validation set, see [26].
Hardware Specification | Yes | The Tesla K40 used for this research was donated by the NVIDIA Corporation.
Software Dependencies | No | The paper does not provide version numbers for software dependencies or libraries; it only specifies hyperparameters.
Experiment Setup | Yes | The RFN hyperparameters are (i) the number of units per layer from {1024, 2048, 4096} and (ii) the dropout rate from {0.0, 0.25, 0.5, 0.75}. The learning rate was fixed to η = 0.01 (the default value). For supervised fine-tuning with stochastic gradient descent, the learning rate was selected from {0.1, 0.01, 0.001}, the masking noise from {0.0, 0.25}, and the number of layers from {1, 3}. Fine-tuning was stopped based on the validation set, see [26]. (A sketch of this hyperparameter grid is given after the table.)
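
The Pseudocode row above only names Algorithm 1; the paper's exact pseudocode is not reproduced in this report. The following NumPy sketch is therefore an assumption about the overall structure of an RFN-style iteration, combining a factor-analysis posterior step with rectification and rescaling of the posterior means and gradient-like parameter updates. Names such as rfn_sketch, W, Psi, and H are illustrative and are not the API of the RFN package.

import numpy as np

def rfn_sketch(V, n_hidden=128, n_iter=100, lr=0.01, seed=0):
    # V: (n_samples, n_visible) median-centered data matrix.
    # Returns loadings W, diagonal noise variances Psi, and the sparse code H.
    rng = np.random.default_rng(seed)
    n, d = V.shape
    W = 0.1 * rng.standard_normal((d, n_hidden))    # factor loadings
    Psi = np.ones(d)                                # diagonal noise variances
    for _ in range(n_iter):
        # Factor-analysis E-step: posterior mean of the hidden units per sample.
        WtPinv = W.T / Psi                                    # W^T Psi^{-1}
        Sigma = np.linalg.inv(np.eye(n_hidden) + WtPinv @ W)  # posterior covariance
        H = V @ (WtPinv.T @ Sigma)                            # posterior means, (n, n_hidden)
        # Posterior constraints: rectify and rescale toward unit second moment,
        # giving sparse non-negative codes.
        H = np.maximum(H, 0.0)
        H /= np.sqrt(np.mean(H ** 2, axis=0, keepdims=True) + 1e-8)
        # M-step-like updates of the model parameters.
        E_hh = (H.T @ H) / n + Sigma                # second moment of the code
        E_vh = (V.T @ H) / n                        # data/code cross-moment
        W += lr * (E_vh - W @ E_hh) / Psi[:, None]  # gradient step on the loadings
        Psi = np.maximum(np.mean(V ** 2, axis=0) - np.sum(W * E_vh, axis=1), 1e-3)
    return W, Psi, H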
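
The Dataset Splits and Experiment Setup rows list the complete hyperparameter grids, so model selection can be read as a plain grid search over the validation set. Below is a minimal sketch of that search; the helper names grid_configs and select_best and the train_and_score callback are hypothetical and not part of the RFN package.

from itertools import product

# Hyperparameter grids exactly as listed in the rows above.
rfn_grid = {
    "n_units_per_layer": [1024, 2048, 4096],
    "dropout_rate": [0.0, 0.25, 0.5, 0.75],
    "learning_rate": [0.01],                 # fixed to the default value
}
finetune_grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "masking_noise": [0.0, 0.25],
    "n_layers": [1, 3],
}

def grid_configs(grid):
    # Enumerate every configuration of a hyperparameter grid as a dict.
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def select_best(grid, train_and_score):
    # Model selection on the validation set: train_and_score(config) trains a
    # model with the given configuration and returns its validation score.
    return max(grid_configs(grid), key=train_and_score)

For example, select_best(rfn_grid, score_fn) returns the pretraining configuration with the highest validation score, and the same call with finetune_grid selects the fine-tuning settings.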