Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Towards End-To-End Speech Recognition with Recurrent Neural Networks
Authors: Alex Graves, Navdeep Jaitly
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the Wall Street Journal speech corpus demonstrate that the system is able to recognise words to reasonable accuracy, even in the absence of a language model or dictionary, and that when combined with a language model it performs comparably to a state-of-the-art pipeline. The experiments were carried out on the Wall Street Journal (WSJ) corpus (available as LDC corpus LDC93S6B and LDC94S13B). |
| Researcher Affiliation | Collaboration | Alex Graves, Google DeepMind, London, United Kingdom; Navdeep Jaitly, Department of Computer Science, University of Toronto, Canada |
| Pseudocode | Yes | Algorithm 1 CTC Beam Search |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | Yes | The experiments were carried out on the Wall Street Journal (WSJ) corpus (available as LDC corpus LDC93S6B and LDC94S13B). |
| Dataset Splits | Yes | The RNN was trained on both the 14 hour subset train-si84 and the full 81 hour set, with the test-dev93 development set used for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions 'matplotlib python toolkit' and 'Kaldi recipe s5' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The network had five levels of bidirectional LSTM hidden layers, with 500 cells in each layer, giving a total of 26.5M weights. It was trained using stochastic gradient descent with one weight update per utterance, a learning rate of 10⁻⁴ and a momentum of 0.9. The DNN was trained with stochastic gradient descent, starting with a learning rate of 0.1, and momentum of 0.9. |
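
The optimiser settings quoted in the Experiment Setup row can be written down as a small sketch. It assumes the classical (heavy-ball) momentum formulation, which the quoted text does not specify, and the names `SGDConfig`, `RNN_SGD`, `DNN_SGD`, and `momentum_step` are hypothetical, introduced only for illustration. It is not the authors' training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SGDConfig:
    """Hyperparameters for SGD with momentum (hypothetical container)."""
    learning_rate: float
    momentum: float


# Values quoted in the Experiment Setup row above.
RNN_SGD = SGDConfig(learning_rate=1e-4, momentum=0.9)
DNN_SGD = SGDConfig(learning_rate=0.1, momentum=0.9)  # starting learning rate


def momentum_step(weights, grads, velocity, cfg):
    """Apply one update, assuming classical momentum:

    v <- momentum * v - learning_rate * g
    w <- w + v
    """
    new_velocity = [
        cfg.momentum * v - cfg.learning_rate * g
        for v, g in zip(velocity, grads)
    ]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

With one weight update per utterance, as the row describes for the RNN, `momentum_step` would be called once per utterance with the gradient computed from that utterance alone.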