Supervised autoencoders: Improving generalization performance with unsupervised regularizers
Authors: Lei Le, Andrew Patterson, Martha White
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically and empirically analyze one such model, called a supervised auto-encoder: a neural network that jointly predicts targets and inputs (reconstruction). We then demonstrate empirically that, across an array of architectures with different numbers of hidden units and activation functions, the supervised auto-encoder never harms performance relative to the corresponding standard neural network and can improve generalization. |
| Researcher Affiliation | Academia | Lei Le, Department of Computer Science, Indiana University, Bloomington, IN (leile@iu.edu); Andrew Patterson and Martha White, Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada ({ap3, whitem}@ualberta.ca) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | SUSY is a high-energy particle physics dataset [37]. Deterding is a vowel dataset [38] containing 11 steady-state vowels of British English spoken by 15 speakers. CIFAR-10 is an image dataset [39] with 10 classes and 60000 32x32 color images. MNIST is a dataset [40] of 70000 examples of 28x28 images of handwritten digits from 0 to 9. |
| Dataset Splits | Yes | We used 10-fold cross-validation to choose the best metaparameters for each algorithm on each dataset. For each of the training-test splits, we used a random subset of 50,000 images for training and 10,000 images for testing. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper does not specify version numbers for any ancillary software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We used 10-fold cross-validation to choose the best metaparameters for each algorithm on each dataset. We use a network with two convolutional layers of sizes {32, 64} and 4 dense layers of sizes {2048, 512, 128, 32} with ReLU activation (a minimal sketch of this backbone follows the table). We focus on the impact of using reconstruction error, and compare SAE and NN with a variety of nonlinear structures, including sigmoid (SAE-Sigmoid and NN-Sigmoid), ReLU (SAE-ReLU and NN-ReLU) and Gaussian kernel (SAE-Kernel and NN-Kernel). |
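Since no source code is released (see the Open Source Code row), the joint objective quoted under Research Type is worth making concrete. The snippet below is a minimal PyTorch sketch of a supervised auto-encoder, not the authors' implementation; the framework choice, the layer sizes, and the `recon_weight` trade-off parameter are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SupervisedAutoencoder(nn.Module):
    """Minimal SAE: a shared hidden layer feeds both a supervised head
    (target prediction) and a reconstruction head (input recovery)."""
    def __init__(self, n_inputs, n_hidden, n_targets):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_inputs, n_hidden), nn.ReLU())
        self.predictor = nn.Linear(n_hidden, n_targets)  # supervised head
        self.decoder = nn.Linear(n_hidden, n_inputs)     # reconstruction head

    def forward(self, x):
        h = self.encoder(x)
        return self.predictor(h), self.decoder(h)

def sae_loss(y_hat, x_hat, y, x, recon_weight=1.0):
    # Joint objective: supervised loss plus reconstruction error,
    # the unsupervised regularizer the paper analyzes.
    supervised = nn.functional.cross_entropy(y_hat, y)
    reconstruction = nn.functional.mse_loss(x_hat, x)
    return supervised + recon_weight * reconstruction

# Example: a batch of 32 flattened 3072-feature inputs with 10 classes.
model = SupervisedAutoencoder(n_inputs=3072, n_hidden=256, n_targets=10)
x = torch.randn(32, 3072)
y = torch.randint(0, 10, (32,))
y_hat, x_hat = model(x)
loss = sae_loss(y_hat, x_hat, y, x)
```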
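The evaluation protocol quoted under Dataset Splits (10-fold cross-validation for metaparameter selection, plus random 50,000/10,000 train/test splits) can be sketched the same way. The scikit-learn version below uses placeholder data and an assumed random seed; it shows the shape of the protocol, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.model_selection import KFold

# Placeholder data standing in for a 60,000-example dataset.
rng = np.random.default_rng(0)
X = rng.standard_normal((60_000, 8))
y = rng.integers(0, 10, size=60_000)

# Random 50,000 / 10,000 train-test split, as reported for the image datasets.
perm = rng.permutation(len(X))
train_idx, test_idx = perm[:50_000], perm[50_000:]

# 10-fold cross-validation over the training portion to pick metaparameters.
for fold, (tr, val) in enumerate(
        KFold(n_splits=10, shuffle=True, random_state=0).split(train_idx)):
    tr_idx, val_idx = train_idx[tr], train_idx[val]
    # Fit each candidate metaparameter setting on (X[tr_idx], y[tr_idx]),
    # score it on (X[val_idx], y[val_idx]), and keep the best average.
    pass
```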
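Finally, a sketch of the convolutional backbone quoted under Experiment Setup. Only the layer widths come from the paper; kernel sizes, padding, and pooling are assumptions chosen so the shapes line up on CIFAR-10-sized inputs.

```python
import torch
import torch.nn as nn

# Backbone matching the reported layer sizes: two convolutional layers
# {32, 64} and four dense layers {2048, 512, 128, 32}, ReLU throughout.
# The 3x3 kernels, padding, and 2x2 max pooling are assumptions made so
# that a 3x32x32 CIFAR-10 input flattens to 64 * 8 * 8 = 4096 features.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                                  # 32x32 -> 16x16
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                                  # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 2048), nn.ReLU(),
    nn.Linear(2048, 512), nn.ReLU(),
    nn.Linear(512, 128), nn.ReLU(),
    nn.Linear(128, 32), nn.ReLU(),
)

features = backbone(torch.randn(4, 3, 32, 32))  # -> shape (4, 32)
# In the SAE variant, both the target predictor and the input decoder
# would branch off this final 32-unit representation.
```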