Towards End-To-End Speech Recognition with Recurrent Neural Networks
Authors: Alex Graves, Navdeep Jaitly
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the Wall Street Journal speech corpus demonstrate that the system is able to recognise words to reasonable accuracy, even in the absence of a language model or dictionary, and that when combined with a language model it performs comparably to a state-of-the-art pipeline. |
| Researcher Affiliation | Collaboration | Alex Graves (graves@cs.toronto.edu), Google DeepMind, London, United Kingdom; Navdeep Jaitly (ndjaitly@cs.toronto.edu), Department of Computer Science, University of Toronto, Canada |
| Pseudocode | Yes | Algorithm 1 CTC Beam Search (a simplified Python sketch follows the table). |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | Yes | The experiments were carried out on the Wall Street Journal (WSJ) corpus (available as LDC corpus LDC93S6B and LDC94S13B). |
| Dataset Splits | Yes | The RNN was trained on both the 14 hour subset train-si84 and the full 81 hour set, with the test-dev93 development set used for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions 'matplotlib python toolkit' and 'Kaldi recipe s5' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The network had five levels of bidirectional LSTM hidden layers, with 500 cells in each layer, giving a total of 26.5M weights. It was trained using stochastic gradient descent with one weight update per utterance, a learning rate of 10⁻⁴ and a momentum of 0.9. The baseline DNN was trained with stochastic gradient descent, starting with a learning rate of 0.1 and a momentum of 0.9 (a configuration sketch follows the table). |
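
The beam search referenced in the Pseudocode row can be made concrete. The sketch below is a minimal prefix-style CTC beam search over per-frame character distributions; it omits the dictionary constraints and language-model rescoring that Algorithm 1 in the paper supports, and the function and variable names are ours, not the authors'.

```python
import numpy as np

def ctc_beam_search(probs, alphabet, beam_width=8, blank=0):
    """Simplified CTC prefix beam search (no dictionary or language model).

    probs: (T, V) array of per-frame label probabilities, where column
    `blank` holds the CTC blank symbol.
    alphabet: string of length V mapping label ids to characters.
    Returns the most probable collapsed label sequence as a string.
    """
    # Each beam entry maps a prefix (tuple of label ids) to a pair
    # (p_blank, p_non_blank): the probability mass of all paths that
    # collapse to the prefix and end in a blank vs. end in its last label.
    beams = {(): (1.0, 0.0)}
    for t in range(probs.shape[0]):
        new_beams = {}
        for prefix, (p_b, p_nb) in beams.items():
            # Emit a blank: the prefix is unchanged.
            b, nb = new_beams.get(prefix, (0.0, 0.0))
            new_beams[prefix] = (b + (p_b + p_nb) * probs[t, blank], nb)
            for c in range(probs.shape[1]):
                if c == blank:
                    continue
                p_c = probs[t, c]
                ext = prefix + (c,)
                if prefix and prefix[-1] == c:
                    # Repeated label: without an intervening blank it
                    # collapses onto the same prefix; after a blank it
                    # genuinely extends the prefix.
                    b, nb = new_beams.get(prefix, (0.0, 0.0))
                    new_beams[prefix] = (b, nb + p_nb * p_c)
                    b, nb = new_beams.get(ext, (0.0, 0.0))
                    new_beams[ext] = (b, nb + p_b * p_c)
                else:
                    b, nb = new_beams.get(ext, (0.0, 0.0))
                    new_beams[ext] = (b, nb + (p_b + p_nb) * p_c)
        # Prune to the beam_width most probable prefixes.
        beams = dict(sorted(new_beams.items(), key=lambda kv: sum(kv[1]),
                            reverse=True)[:beam_width])
    best = max(beams.items(), key=lambda kv: sum(kv[1]))[0]
    return "".join(alphabet[c] for c in best)

# Toy usage: three frames over the vocabulary {blank, 'a', 'b'}.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.2, 0.6, 0.2],
                  [0.6, 0.2, 0.2]])
print(ctc_beam_search(probs, alphabet="-ab"))  # prints "a"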
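
Similarly, the Experiment Setup row maps onto a small amount of modern framework code. The following PyTorch sketch is an assumption-laden reconstruction, not the authors' implementation: the paper predates PyTorch, and the input feature and output label dimensions below are illustrative placeholders. It shows the five-layer bidirectional LSTM with 500 cells per direction and the quoted SGD settings.

```python
import torch
import torch.nn as nn

# Illustrative dimensions only: the paper feeds spectrogram frames to the
# network and decodes to characters, but the exact sizes below are our
# placeholders, not values confirmed by the paper.
NUM_FEATURES = 123  # input features per frame (assumption)
NUM_LABELS = 30     # characters plus the CTC blank (assumption)

class DeepBiLSTM(nn.Module):
    """Five stacked bidirectional LSTM layers, 500 cells per direction,
    per the Experiment Setup row above."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(input_size=NUM_FEATURES, hidden_size=500,
                           num_layers=5, bidirectional=True,
                           batch_first=True)
        self.out = nn.Linear(2 * 500, NUM_LABELS)

    def forward(self, x):          # x: (batch, time, NUM_FEATURES)
        h, _ = self.rnn(x)
        return self.out(h)         # per-frame logits for CTC training

model = DeepBiLSTM()
# SGD with one update per utterance, learning rate 1e-4, momentum 0.9,
# as quoted from the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
ctc_loss = nn.CTCLoss(blank=0)
```

With an input dimension in the low hundreds, this stack lands near the 26.5M weights the paper reports, which serves as a rough sanity check on the architecture reading, though the paper's exact layer wiring and LSTM variant (it uses peephole connections) may differ.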