Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
Authors: Yann N. Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, Yoshua Bengio
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply this algorithm to deep or recurrent neural network training, and provide numerical evidence for its superior optimization performance. |
| Researcher Affiliation | Academia | Yann N. Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho: Université de Montréal (EMAIL, EMAIL, EMAIL, EMAIL); Surya Ganguli: Stanford University (EMAIL); Yoshua Bengio: Université de Montréal, CIFAR Fellow (EMAIL) |
| Pseudocode | Yes | Algorithm 1 Approximate saddle-free Newton |
| Open Source Code | No | The paper does not explicitly state that the source code for their methodology is made publicly available. |
| Open Datasets | Yes | We used a small MLP trained on a down-sampled version of MNIST and CIFAR-10. [...] deep autoencoder trained on (full-scale) MNIST [...] trained a small recurrent neural network having 120 hidden units for the task of character-level language modeling on Penn Treebank corpus. |
| Dataset Splits | No | The paper mentions training on datasets like MNIST, CIFAR-10, and Penn Treebank, and that 'The hyperparameters of SGD were selected via random search', which implies a validation process. However, it does not explicitly provide specific percentages, counts, or predefined splits for training, validation, and testing. |
| Hardware Specification | No | The paper acknowledges 'Compute Canada, and Calcul Québec for providing computational resources', implying high-performance computing, but does not provide specific hardware details such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions using 'Theano (Bergstra et al., 2010; Bastien et al., 2012)' but does not provide specific version numbers for Theano or any other software dependencies crucial for reproducibility. |
| Experiment Setup | Yes | The paper provides some specific experimental setup details, such as 'we used the Krylov subspace descent approach described earlier with 500 subspace vectors' for deep autoencoders, and 'trained a small recurrent neural network having 120 hidden units' for RNNs. It also notes that 'The hyperparameters of SGD were selected via random search' and 'damping coefficients... were selected from a small set at each update'. |