Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks

Authors: David Balduzzi, Brian McWilliams, Tony Butler-Yeoman

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on a range of optimizers, layers, and tasks provide evidence that the analysis accurately captures the dynamics of neural optimization. |
| Researcher Affiliation | Collaboration | ¹Victoria University of Wellington, New Zealand; ²Disney Research, Zürich, Switzerland. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | No concrete access to source code is provided (e.g., a specific repository link or an explicit statement of code release in supplementary materials). |
| Open Datasets | Yes | Autoencoder trained on MNIST. Convnet trained on CIFAR-10. |
| Dataset Splits | No | No specific dataset split information (e.g., percentages, sample counts, or explicit mention of validation splits) is provided for reproducibility. |
| Hardware Specification | Yes | Some experiments were performed using a Tesla K80 kindly donated by Nvidia. |
| Software Dependencies | No | The paper mentions TensorFlow but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Autoencoder trained on MNIST: dense layers with architecture 784 → 50 → 30 → 20 → 30 → 50 → 784 and ReLU non-linearities, trained with MSE loss using minibatches of 64. Convnet trained on CIFAR-10: three convolutional layers with stack size 64 and 5×5 receptive fields, ReLU nonlinearities, and 2×2 max-pooling, followed by a 192-unit fully-connected ReLU layer and a ten-dimensional fully-connected output layer, trained with cross-entropy loss using minibatches of 128. Learning rates were tuned for optimal performance; additional parameters for Adam and RMSProp were left at default. (A code sketch of these two architectures appears below the table.) |
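For concreteness, below is a minimal sketch of the two architectures described in the Experiment Setup row, written against the tf.keras API (the paper names TensorFlow but releases no code). The padding choice, output activations, and the use of Adam with default settings are assumptions for illustration, not the authors' implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

# MNIST autoencoder: dense 784 -> 50 -> 30 -> 20 -> 30 -> 50 -> 784 with ReLU,
# trained with MSE loss on minibatches of 64.
autoencoder = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    layers.Dense(50, activation="relu"),
    layers.Dense(30, activation="relu"),
    layers.Dense(20, activation="relu"),
    layers.Dense(30, activation="relu"),
    layers.Dense(50, activation="relu"),
    layers.Dense(784),  # output activation unspecified in the paper; linear assumed
])
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(x_train, x_train, batch_size=64)

# CIFAR-10 convnet: three 5x5 conv layers with stack size 64, each with ReLU
# and 2x2 max-pooling, then a 192-unit ReLU dense layer and a 10-way output,
# trained with cross-entropy loss on minibatches of 128.
convnet = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(64, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(192, activation="relu"),
    layers.Dense(10),  # logits; cross-entropy applied with from_logits=True
])
convnet.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
# convnet.fit(x_train, y_train, batch_size=128)
```

The paper states only that learning rates were tuned and that other Adam and RMSProp parameters were left at default, so the default Adam optimizer above stands in for whichever tuned configuration was actually used.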