Regularized linear autoencoders recover the principal components, eventually
Authors: Xuchan Bao, James Lucas, Sushant Sachdeva, Roger B. Grosse
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figures 2 and 3 show the learning dynamics of these models on the MNIST dataset [18], with k = 20. Further details can be found in Appendix H. We use full-batch training for this experiment, which is sufficient to demonstrate the symmetry-breaking properties of these models. For completeness, we also show mini-batch experiments in Appendix I.2. |
| Researcher Affiliation | Academia | Xuchan Bao, James Lucas, Sushant Sachdeva, Roger Grosse University of Toronto; Vector Institute {jennybao,jlucas,sachdeva,rgrosse}@cs.toronto.edu |
| Pseudocode | Yes | Algorithm 1 Rotation augmented gradient (RAG) |
| Open Source Code | Yes | The code is available at https://github.com/XuchanBao/linear-ae |
| Open Datasets | Yes | Figures 2 and 3 show the learning dynamics of these models on the MNIST dataset [18], with k = 20. |
| Dataset Splits | No | The paper refers to training and testing implicitly through its experiments and results, but it does not specify how the dataset was split into training, validation, and test sets (e.g., percentages, sample counts, or an explicit reference to a predefined split). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or other computational resources. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., libraries, frameworks like PyTorch or TensorFlow, along with their versions) that would be needed for reproducibility. |
| Experiment Setup | No | The paper mentions using "Nesterov accelerated gradient descent and the Adam optimizer [14]" and that "The learning rate for each model and optimizer has been tuned to have the fastest convergence", but it does not specify the actual numerical values of hyperparameters like the learning rate, batch size, or number of epochs used in the experiments. |
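
To make the quoted setup concrete, the following is a minimal NumPy sketch of a non-uniformly L2-regularized linear autoencoder trained with plain full-batch gradient descent (the paper's k = 20 and full-batch training are kept; the paper itself uses Nesterov and Adam with tuned learning rates), followed by a check of how well its decoder columns align with the top-k principal components. This is not the authors' implementation (that is in the linked repository): synthetic Gaussian data stands in for MNIST, and the regularization strengths, learning rate, and step count are assumptions, since the paper's hyperparameter values were not extractable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the (centered) MNIST data matrix: n samples, d features,
# produced by a random mixing map with a decaying spectrum.
n, d, k = 1000, 100, 20
mix = rng.normal(size=(d, d)) / np.sqrt(d)
X = rng.normal(size=(n, d)) @ (mix * np.exp(-0.05 * np.arange(d)))
X -= X.mean(axis=0)

# Assumed non-uniform L2 penalties, one per latent dimension; the unequal values
# are what break the rotational symmetry among the latent directions.
lam = 1e-3 * (1.0 + np.arange(k))

W1 = 0.01 * rng.normal(size=(k, d))   # encoder, maps d -> k
W2 = 0.01 * rng.normal(size=(d, k))   # decoder, maps k -> d
lr, steps = 0.05, 10000               # assumed; not reported in the paper

for _ in range(steps):                # full-batch gradient descent
    Z = X @ W1.T                      # latent codes, shape (n, k)
    E = Z @ W2.T - X                  # reconstruction error, shape (n, d)
    gW2 = (2.0 / n) * E.T @ Z + 2.0 * W2 * lam                  # penalty on decoder columns
    gW1 = (2.0 / n) * (E @ W2).T @ X + 2.0 * lam[:, None] * W1  # penalty on encoder rows
    W1 -= lr * gW1
    W2 -= lr * gW2

# Reference principal directions from an SVD of the centered data.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pcs = Vt[:k]

# |cosine| between each principal direction and the matching decoder column;
# a near-identity diagonal means the individual (ordered) PCs are recovered.
# Symmetry breaking can be slow, echoing the paper's title ("eventually").
cols = W2 / np.linalg.norm(W2, axis=0, keepdims=True)
alignment = np.abs(pcs @ cols)
print(np.round(np.diag(alignment), 3))
```

The diagonal of `alignment` only approaches one gradually, which is the behavior the paper analyzes; the quoted Nesterov/Adam optimizers and the paper's rotation augmented gradient (Algorithm 1) are aimed at accelerating exactly this slow symmetry breaking.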