Learning Dynamics of Linear Denoising Autoencoders

Authors: Arnu Pretorius, Steve Kroon, Herman Kamper

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verify our theoretical predictions with simulations as well as experiments on MNIST and CIFAR-10. Furthermore, in a comparison of the learning dynamics of DAEs to standard regularised autoencoders, we show that noise has a similar regularisation effect to weight decay, but with faster training dynamics. We also show that our theoretical predictions approximate learning dynamics on real-world data and qualitatively match observed dynamics in nonlinear DAEs."
Researcher Affiliation | Academia | "1 Computer Science Division, Stellenbosch University, South Africa; 2 CSIR/SU Centre for Artificial Intelligence Research; 3 Department of Electrical and Electronic Engineering, Stellenbosch University, South Africa."
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code to reproduce all the results in this paper is available at: https://github.com/arnupretorius/lindaedynamics_icml2018"
Open Datasets | Yes | "To verify the dynamics of learning on real-world data sets we compared theoretical predictions with actual learning on MNIST and CIFAR-10."
Dataset Splits | No | The paper mentions training sample sizes (N = 50000 for MNIST, N = 30000 for CIFAR-10) and the use of a 'development set' in Section 4, but does not specify how the data was split into training, validation, and test sets, or whether a dedicated validation set was used for hyperparameter tuning.
Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details, such as library or solver names with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | "For MNIST, we trained each autoencoder with small randomly initialised weights, using N = 50000 training samples for 5000 epochs, with a learning rate α = 0.01 and a hidden layer width of H = 256. For the WDAE, the penalty parameter was set at γ = 0.5 and for the DAE, σ² = 0.5. Here [for CIFAR-10], we trained each network with small randomly initialised weights using N = 30000 training samples for 5000 epochs, with a learning rate α = 0.001 and a hidden dimension H = 512. For the WDAE, the penalty parameter was set at γ = 0.5 and for the DAE, σ² = 0.5." A minimal training sketch based on these settings follows the table.
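The hyperparameters quoted in the Experiment Setup row are enough to outline the training loop being compared in the paper. The following is a minimal sketch, not the authors' released code (that is at the GitHub link above): it trains a single-hidden-layer linear autoencoder by full-batch gradient descent, either on noise-corrupted inputs (DAE, σ² = 0.5) or with an L2 weight penalty (WDAE, γ = 0.5). The synthetic Gaussian data, the reduced sizes, and all names (e.g. `train`, `W1`, `W2`) are illustrative assumptions, not taken from the paper or its repository.

```python
import numpy as np

# Minimal sketch of the quoted experiment setup, NOT the authors' released code.
# A single-hidden-layer *linear* autoencoder is trained by full-batch gradient
# descent, either with corrupted inputs (DAE, sigma^2 = 0.5) or with weight
# decay (WDAE, gamma = 0.5). Synthetic Gaussian data stands in for MNIST, and
# the sizes are reduced so the sketch runs in seconds; the paper quotes
# N = 50000 MNIST samples, H = 256, alpha = 0.01 and 5000 epochs.

rng = np.random.default_rng(0)

N, D, H = 1000, 64, 32        # reduced stand-ins for N = 50000, D = 784, H = 256
alpha, epochs = 0.01, 500     # paper: alpha = 0.01, 5000 epochs (MNIST)
sigma2, gamma = 0.5, 0.5      # noise variance (DAE) / weight-decay penalty (WDAE)

X = rng.standard_normal((N, D))   # placeholder for centred MNIST inputs

def train(mode="dae"):
    # Small random initialisation, as described in the paper.
    W1 = 1e-3 * rng.standard_normal((D, H))
    W2 = 1e-3 * rng.standard_normal((H, D))
    for _ in range(epochs):
        if mode == "dae":
            # Corrupt the input with isotropic Gaussian noise; the target stays clean.
            X_in = X + np.sqrt(sigma2) * rng.standard_normal(X.shape)
        else:
            X_in = X
        X_hat = X_in @ W1 @ W2                       # linear reconstruction
        R = X_hat - X                                # residual w.r.t. the clean target
        loss = 0.5 * np.mean(np.sum(R ** 2, axis=1))
        # Full-batch gradients of the mean squared reconstruction error.
        G2 = (X_in @ W1).T @ R / N
        G1 = X_in.T @ (R @ W2.T) / N
        if mode == "wdae":                           # L2 penalty on the weights
            G1 += gamma * W1
            G2 += gamma * W2
        W1 -= alpha * G1
        W2 -= alpha * G2
    return loss

print("final DAE  loss:", train("dae"))
print("final WDAE loss:", train("wdae"))
```

For the quoted CIFAR-10 settings one would substitute N = 30000, H = 512 and α = 0.001; loading the real datasets and the paper's nonlinear DAE comparison are omitted from this sketch.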