Regularized linear autoencoders recover the principal components, eventually

Authors: Xuchan Bao, James Lucas, Sushant Sachdeva, Roger B. Grosse

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Figure 2 and 3 show the learning dynamics of these models on the MNIST dataset [18], with k = 20. Further details can be found in Appendix H. We use full-batch training for this experiment, which is sufficient to demonstrate the symmetry breaking properties of these models. For completeness, we also show mini-batch experiments in Appendix I.2. |
| Researcher Affiliation | Academia | Xuchan Bao, James Lucas, Sushant Sachdeva, Roger Grosse. University of Toronto; Vector Institute. {jennybao,jlucas,sachdeva,rgrosse}@cs.toronto.edu |
| Pseudocode | Yes | Algorithm 1: Rotation augmented gradient (RAG) |
| Open Source Code | Yes | The code is available at https://github.com/XuchanBao/linear-ae |
| Open Datasets | Yes | Figure 2 and 3 show the learning dynamics of these models on the MNIST dataset [18], with k = 20. |
| Dataset Splits | No | The paper refers to training and testing implicitly through its experiments and results, but it does not specify how the dataset was split into training, validation, and test sets (e.g., percentages, sample counts, or a reference to predefined splits). |
| Hardware Specification | No | The paper does not describe the hardware used to run the experiments, such as GPU/CPU models, memory, or other computational resources. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., libraries or frameworks such as PyTorch or TensorFlow) that would be needed for reproducibility. |
| Experiment Setup | No | The paper mentions using "Nesterov accelerated gradient descent and the Adam optimizer [14]" and states that "The learning rate for each model and optimizer has been tuned to have the fastest convergence", but it does not report the actual hyperparameter values, such as the learning rate, batch size, or number of epochs. |
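
To make the reported setup concrete, here is a minimal, illustrative sketch of the kind of experiment the table describes: a linear autoencoder with k = 20 latent dimensions trained full-batch on MNIST with an L2 weight penalty. This is not the authors' released code (see the repository linked above); PyTorch/torchvision, the per-dimension penalty values, the learning rate, and the step count are all assumptions or placeholders, since the paper does not report these hyperparameter values.

```python
# Illustrative sketch only -- not the authors' released code (see the repository above).
# Assumptions: linear autoencoder, k = 20, full-batch training on MNIST, with an L2 weight
# penalty that differs per latent dimension; penalty weights, learning rate, and step count
# are placeholders, not values reported in the paper.
import torch
from torchvision import datasets

k = 20
mnist = datasets.MNIST(root="./data", train=True, download=True)
X = mnist.data.reshape(len(mnist), -1).float() / 255.0   # (60000, 784)
X = X - X.mean(dim=0)                                     # center the data

W1 = (0.01 * torch.randn(k, 784)).requires_grad_()        # encoder weights
W2 = (0.01 * torch.randn(784, k)).requires_grad_()        # decoder weights
lam = torch.linspace(0.01, 0.2, k)                        # distinct per-dimension penalties (placeholder)

# The paper tunes the learning rate per model and optimizer; 1e-3 is a placeholder.
opt = torch.optim.Adam([W1, W2], lr=1e-3)

for step in range(2000):                                  # full-batch gradient steps (placeholder count)
    recon = X @ W1.t() @ W2.t()                           # reconstruct every example at once
    reg = (lam * (W1.pow(2).sum(dim=1) + W2.pow(2).sum(dim=0))).sum()
    loss = (recon - X).pow(2).sum() / len(X) + reg
    opt.zero_grad()
    loss.backward()
    opt.step()

# Per the paper's title, the decoder columns align with the ordered principal
# directions only after sufficiently long training ("eventually").
components = W2.detach()
```

The per-dimension penalty vector `lam` is the part most specific to this sketch; the paper's precise regularization form, Algorithm 1 (RAG), and the tuned optimizer settings are given in the paper and its repository rather than reproduced here.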