Learning Dynamics of Linear Denoising Autoencoders
Authors: Arnu Pretorius, Steve Kroon, Herman Kamper
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our theoretical predictions with simulations as well as experiments on MNIST and CIFAR-10. Furthermore, in a comparison of the learning dynamics of DAEs to standard regularised autoencoders, we show that noise has a similar regularisation effect to weight decay, but with faster training dynamics. We also show that our theoretical predictions approximate learning dynamics on real-world data and qualitatively match observed dynamics in nonlinear DAEs. |
| Researcher Affiliation | Academia | Computer Science Division, Stellenbosch University, South Africa; CSIR/SU Centre for Artificial Intelligence Research; Department of Electrical and Electronic Engineering, Stellenbosch University, South Africa. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce all the results in this paper is available at: https://github.com/arnupretorius/lindaedynamics_icml2018 |
| Open Datasets | Yes | To verify the dynamics of learning on real-world data sets we compared theoretical predictions with actual learning on MNIST and CIFAR-10. |
| Dataset Splits | No | The paper mentions training sample sizes (N = 50000 for MNIST, N = 30000 for CIFAR-10) and the use of a 'development set' in Section 4, but it does not specify how the data were split into training, validation, and test sets, nor whether a dedicated validation set was used for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For MNIST, we trained each autoencoder with small randomly initialised weights, using N = 50000 training samples for 5000 epochs, with a learning rate α = 0.01 and a hidden layer width of H = 256. For the WDAE, the penalty parameter was set at γ = 0.5 and for the DAE, σ² = 0.5. For CIFAR-10, we trained each network with small randomly initialised weights using N = 30000 training samples for 5000 epochs, with a learning rate α = 0.001 and a hidden dimension H = 512; again the WDAE penalty was γ = 0.5 and the DAE noise σ² = 0.5. (A training sketch using these hyperparameters appears below the table.) |
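
The quoted setup translates into a small training loop. Below is a minimal sketch of a single-hidden-layer linear DAE trained with full-batch gradient descent on a squared-error reconstruction loss, using the MNIST hyperparameters from the Experiment Setup row (H = 256, α = 0.01, σ² = 0.5, 5000 epochs). The function name, data handling, and initialisation scale are illustrative assumptions, not the authors' implementation; see their repository linked above for the code used in the paper.

```python
import numpy as np

# Minimal sketch of a single-hidden-layer linear denoising autoencoder.
# Hyperparameter values follow the MNIST setup quoted above; everything
# else (initialisation scale, full-batch updates) is an assumption.

def train_linear_dae(X, hidden_dim=256, lr=0.01, noise_var=0.5,
                     epochs=5000, seed=0):
    """Train W1, W2 so that W2 @ W1 reconstructs X from noisy inputs."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Small randomly initialised weights, as described in the paper.
    W1 = rng.normal(scale=1e-3, size=(d, hidden_dim))
    W2 = rng.normal(scale=1e-3, size=(hidden_dim, d))

    for _ in range(epochs):
        # Corrupt inputs with isotropic Gaussian noise of variance sigma^2.
        X_noisy = X + rng.normal(scale=np.sqrt(noise_var), size=X.shape)
        H = X_noisy @ W1          # hidden representation
        X_hat = H @ W2            # linear reconstruction
        err = X_hat - X           # reconstruct the clean input
        # Gradient descent on (1 / 2n) * ||X_noisy W1 W2 - X||_F^2.
        grad_W2 = H.T @ err / n
        grad_W1 = X_noisy.T @ (err @ W2.T) / n
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    return W1, W2

# Example usage (X_train assumed to be an N x 784 array of MNIST vectors):
# W1, W2 = train_linear_dae(X_train)
```

Swapping in hidden_dim = 512 and lr = 0.001 would match the quoted CIFAR-10 configuration; the WDAE comparison in the paper would instead train on clean inputs and add a weight-decay penalty (γ) to each gradient update.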