Rectified Factor Networks
Authors: Djork-Arné Clevert, Andreas Mayr, Thomas Unterthiner, Sepp Hochreiter
NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On benchmarks, RFNs are compared to other unsupervised methods like autoencoders, RBMs, factor analysis, ICA, and PCA. In contrast to previous sparse coding methods, RFNs yield sparser codes, capture the data's covariance structure more precisely, and have a significantly smaller reconstruction error. We test RFNs as a pretraining technique for deep networks on different vision datasets, where RFNs were superior to RBMs and autoencoders. On gene expression data from two pharmaceutical drug discovery studies, RFNs detected small and rare gene modules that revealed highly relevant new biological insights which were so far missed by other unsupervised methods. |
| Researcher Affiliation | Academia | Djork-Arné Clevert, Andreas Mayr, Thomas Unterthiner and Sepp Hochreiter, Institute of Bioinformatics, Johannes Kepler University, Linz, Austria. {okko,mayr,unterthiner,hochreit}@bioinf.jku.at |
| Pseudocode | Yes | Algorithm 1 Rectified Factor Network. (A hedged sketch of this training loop is given after the table.) |
| Open Source Code | Yes | RFN package for GPU/CPU is available at http://www.bioinf.jku.at/software/rfn. |
| Open Datasets | Yes | The benchmark datasets and results are taken from previous publications [25, 26, 27, 28] and contain: (i) MNIST (original MNIST), (ii) basic (a smaller subset of MNIST for training), (iii) bg-rand (MNIST with random noise background), (iv) bg-img (MNIST with random image background), (v) rect (tall or wide rectangles), (vi) rect-img (tall or wide rectangular images with random background images), (vii) convex (convex or concave shapes), (viii) CIFAR-10 (60k color images in 10 classes), and (ix) NORB (29,160 stereo image pairs of 5 categories). For each dataset its size of training, validation and test set is given in the second column of Tab. 2. |
| Dataset Splits | Yes | For each dataset its size of training, validation and test set is given in the second column of Tab. 2. As preprocessing we only performed median centering. Model selection is based on the validation set [26]. The RFN hyperparameters are (i) the number of units per layer from {1024, 2048, 4096} and (ii) the dropout rate from {0.0, 0.25, 0.5, 0.75}. The learning rate was fixed to η = 0.01 (default value). For supervised fine-tuning with stochastic gradient descent, we selected the learning rate from {0.1, 0.01, 0.001}, the masking noise from {0.0, 0.25}, and the number of layers from {1, 3}. Fine-tuning was stopped based on the validation set, see [26]. |
| Hardware Specification | Yes | The Tesla K40 used for this research was donated by the NVIDIA Corporation. |
| Software Dependencies | No | The paper does not provide version numbers for software dependencies or libraries; it only specifies hyperparameter settings. |
| Experiment Setup | Yes | The RFN hyperparameters are (i) the number of units per layer from {1024, 2048, 4096} and (ii) the dropout rate from {0.0, 0.25, 0.5, 0.75}. The learning rate was fixed to η = 0.01 (default value). For supervised fine-tuning with stochastic gradient descent, we selected the learning rate from {0.1, 0.01, 0.001}, the masking noise from {0.0, 0.25}, and the number of layers from {1, 3}. Fine-tuning was stopped based on the validation set, see [26]. (See the grid-search sketch after the table.) |
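
The paper's Algorithm 1 alternates a factor-analysis E-step, a rectification of the posterior means, and gradient-style M-step updates of the loading matrix and the diagonal noise covariance. The NumPy sketch below only illustrates that structure under simplifying assumptions: the paper's posterior regularization and projection steps are reduced to a plain rectify-and-rescale of the posterior means, and the exact M-step is replaced by a single gradient step. The function name `rfn_train` and all default values are ours, not the API of the released RFN package.

```python
import numpy as np

def rfn_train(V, n_hidden, n_iter=100, lr=0.1, min_psi=1e-3, seed=None):
    """Hedged sketch of an RFN-style training loop (not the official package).

    V: (n_samples, n_vis) median-centered data.
    Returns the loading matrix W (n_hidden, n_vis) and noise variances psi.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.normal(scale=0.01, size=(n_hidden, m))
    psi = np.full(m, 1.0)                       # diagonal noise covariance
    C = (V.T @ V) / n                           # data covariance (V is centered)
    I = np.eye(n_hidden)
    for _ in range(n_iter):
        # Factor-analysis E-step: posterior N(mu_i, Sigma) for each sample
        WPinv = W / psi                         # W @ diag(1/psi): column j scaled by 1/psi[j]
        Sigma = np.linalg.inv(I + WPinv @ W.T)  # posterior covariance
        Mu = V @ WPinv.T @ Sigma                # posterior means, shape (n, n_hidden)
        # Rectification: project posterior means onto the non-negative orthant,
        # then rescale each hidden unit to unit second moment
        Mu = np.maximum(Mu, 0.0)
        Mu = Mu / (np.sqrt((Mu ** 2).mean(axis=0)) + 1e-8)
        # Sufficient statistics and gradient-style M-step updates
        U = (Mu.T @ V) / n                      # E[h v^T]
        S = (Mu.T @ Mu) / n + Sigma             # E[h h^T]
        W += lr * (U - S @ W)                   # (scaled) gradient ascent step
        psi = np.maximum(np.diag(C - W.T @ U), min_psi)  # keep Psi bounded away from 0
    return W, psi
```

Rectifying the posterior means onto the non-negative orthant is what makes the codes sparse; the per-unit rescaling keeps hidden units from collapsing to zero under the rectification.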
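
Model selection, as quoted in the Dataset Splits and Experiment Setup rows, is a grid search over layer width and dropout rate, scored on the validation set. Below is a minimal sketch of that loop; `train_rfn` and `validation_error` are hypothetical callables standing in for the RFN package's training and evaluation routines, whose actual API at http://www.bioinf.jku.at/software/rfn may differ.

```python
from itertools import product

# Hyperparameter grid quoted from the paper; the learning rate is fixed.
UNITS = [1024, 2048, 4096]
DROPOUT = [0.0, 0.25, 0.5, 0.75]
LEARNING_RATE = 0.01

def select_rfn(train_data, valid_data, train_rfn, validation_error):
    """Return the model with the lowest validation error over the grid."""
    best_err, best_model = float("inf"), None
    for n_units, dropout in product(UNITS, DROPOUT):
        model = train_rfn(train_data, n_units=n_units,
                          dropout=dropout, learning_rate=LEARNING_RATE)
        err = validation_error(model, valid_data)
        if err < best_err:
            best_err, best_model = err, model
    return best_model
```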