Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks
Authors: David Balduzzi, Brian McWilliams, Tony Butler-Yeoman
ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a range of optimizers, layers, and tasks provide evidence that the analysis accurately captures the dynamics of neural optimization. |
| Researcher Affiliation | Collaboration | 1 Victoria University of Wellington, New Zealand; 2 Disney Research, Zürich, Switzerland. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | No access to source code is provided (e.g., no repository link or explicit statement of code release in the paper or supplementary materials). |
| Open Datasets | Yes | Autoencoder trained on MNIST. Convnet trained on CIFAR-10. |
| Dataset Splits | No | No specific dataset split information (e.g., percentages, sample counts, or explicit mention of validation splits) is provided for reproducibility. |
| Hardware Specification | Yes | Some experiments were performed using a Tesla K80 kindly donated by Nvidia. |
| Software Dependencies | No | The paper mentions TensorFlow but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Autoencoder trained on MNIST: dense layers with architecture 784 → 50 → 30 → 20 → 30 → 50 → 784 and ReLU non-linearities, trained with MSE loss using minibatches of 64. Convnet trained on CIFAR-10: three convolutional layers with stack size 64 and 5×5 receptive fields, ReLU nonlinearities, and 2×2 max-pooling, followed by a 192-unit fully-connected ReLU layer and a ten-dimensional fully-connected output layer; trained with cross-entropy loss using minibatches of 128. Learning rates were tuned for optimal performance; additional parameters for Adam and RMSProp were left at their defaults. (A minimal code sketch of these architectures follows the table.) |
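
Below is a minimal sketch of the two architectures described in the Experiment Setup row, written with tf.keras. This is not the authors' code: the convolution padding, the final-layer activations, and the use of Keras defaults for Adam are assumptions; only the layer sizes, non-linearities, losses, and minibatch sizes come from the table above.

```python
# Hedged sketch of the paper's two experimental models (assumptions noted inline).
import tensorflow as tf
from tensorflow.keras import layers, models

# MNIST autoencoder: dense 784 -> 50 -> 30 -> 20 -> 30 -> 50 -> 784 with ReLU,
# MSE loss, minibatches of 64. Linear output layer is an assumption.
autoencoder = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(50, activation="relu"),
    layers.Dense(30, activation="relu"),
    layers.Dense(20, activation="relu"),
    layers.Dense(30, activation="relu"),
    layers.Dense(50, activation="relu"),
    layers.Dense(784),  # assumed linear reconstruction layer
])
autoencoder.compile(optimizer="adam", loss="mse")

# CIFAR-10 convnet: three conv layers (64 filters, 5x5 receptive fields) with
# ReLU and 2x2 max-pooling, a 192-unit ReLU dense layer, and a 10-way output;
# cross-entropy loss, minibatches of 128. "same" padding is an assumption.
convnet = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(64, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(192, activation="relu"),
    layers.Dense(10),  # logits for the ten CIFAR-10 classes
])
convnet.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```

With data loaded, `autoencoder.fit(x_mnist, x_mnist, batch_size=64)` and `convnet.fit(x_cifar, y_cifar, batch_size=128)` would match the minibatch sizes stated in the table; the paper's learning-rate tuning is not reproduced here.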