An Online Sequence-to-Sequence Model Using Partial Conditioning

Authors: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, David Sussillo, Samy Bengio

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the Neural Transducer works well in settings where it is required to produce output predictions as data come in. We also find that the Neural Transducer performs well for long sequences even when attention mechanisms are not used. ... On the TIMIT phoneme recognition task, a Neural Transducer (with a 3-layer unidirectional LSTM encoder and a 3-layer unidirectional LSTM transducer) can achieve a phoneme error rate (PER) of 20.8%, which is close to state-of-the-art for unidirectional models. (A hedged architecture sketch based on this description follows the table.)
Researcher Affiliation | Industry | Navdeep Jaitly (Google Brain, ndjaitly@google.com); David Sussillo (Google Brain, sussillo@google.com); Quoc V. Le (Google Brain, qvl@google.com); Oriol Vinyals (Google DeepMind, vinyals@google.com); Ilya Sutskever (OpenAI, ilyasu@openai.com); Samy Bengio (Google Brain, bengio@google.com)
Pseudocode | No | The paper describes algorithms in text (e.g., dynamic programming for alignment, beam search for inference), but does not present structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., repository link, explicit statement of code release, or mention of code in supplementary materials) for the described methodology.
Open Datasets | Yes | We used TIMIT, a standard benchmark for speech recognition, for our larger experiments.
Dataset Splits | Yes | We used TIMIT, a standard benchmark for speech recognition, for our larger experiments. ... Note that TIMIT provides a validation set, called the dev set. We use these terms interchangeably.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using the 'Kaldi toolkit' to generate alignments but does not specify its version number or any other software dependencies with version numbers.
Experiment Setup | Yes | We used stochastic gradient descent with momentum with a batch size of one utterance per training step. An initial learning rate of 0.05 and a momentum of 0.9 were used. The learning rate was reduced by a factor of 0.5 every time the average log prob over the validation set decreased. The decrease was applied a maximum of 4 times. The models were trained for 50 epochs and the parameters from the epoch with the best dev set log prob were used for decoding. (A hedged sketch of this schedule follows the table.)
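The Research Type row above quotes the TIMIT model configuration: a 3-layer unidirectional LSTM encoder and a 3-layer unidirectional LSTM transducer that produces outputs as data come in. The sketch below restates that description as a minimal PyTorch module; the feature dimension, hidden size, block width, the dot-product attention, and all identifiers are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class NeuralTransducerSketch(nn.Module):
    """Minimal sketch: a 3-layer unidirectional LSTM encoder plus a 3-layer LSTM
    transducer that emits symbols one input block at a time. Every dimension
    (n_feats, hidden, block_size) is a hypothetical placeholder."""

    def __init__(self, n_feats=40, n_symbols=62, hidden=256, block_size=15):
        super().__init__()
        self.block_size = block_size
        self.encoder = nn.LSTM(n_feats, hidden, num_layers=3, batch_first=True)
        self.embed = nn.Embedding(n_symbols, hidden)
        self.transducer = nn.LSTM(2 * hidden, hidden, num_layers=3, batch_first=True)
        self.out = nn.Linear(hidden, n_symbols)

    def step(self, enc_block, prev_symbol, state=None):
        # enc_block: (1, W, hidden) encoder outputs for the current input block.
        # prev_symbol: (1, 1) index of the most recently emitted symbol.
        emb = self.embed(prev_symbol)                       # (1, 1, hidden)
        # Simple dot-product attention restricted to the current block, so symbols
        # can be produced online rather than after the whole utterance is seen.
        scores = torch.bmm(emb, enc_block.transpose(1, 2))  # (1, 1, W)
        context = torch.bmm(torch.softmax(scores, dim=-1), enc_block)
        dec_out, state = self.transducer(torch.cat([emb, context], dim=-1), state)
        return self.out(dec_out), state

# Hypothetical single-block usage: encode 15 frames of 40-dim features, then
# take one transducer step conditioned on a start symbol (index 0).
model = NeuralTransducerSketch()
enc, _ = model.encoder(torch.randn(1, 15, 40))
logits, state = model.step(enc, torch.tensor([[0]]))
```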
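The Experiment Setup row gives the optimization recipe in enough detail to restate as code. Below is a small plain-Python sketch of that learning-rate schedule, assuming the halving rule fires whenever the average validation log probability falls below the best value seen so far and fires at most four times; the function and variable names are invented for illustration, and the model, data, and SGD update itself are omitted.

```python
def step_learning_rate(lr, best_val_logprob, val_logprob, reductions,
                       factor=0.5, max_reductions=4):
    """Halve the learning rate when the average validation log prob decreases,
    applying the reduction at most `max_reductions` times, per the quoted setup
    (initial lr 0.05, momentum 0.9, batch size of one utterance)."""
    if val_logprob < best_val_logprob and reductions < max_reductions:
        return lr * factor, reductions + 1
    return lr, reductions

# Hypothetical 50-epoch loop tracking the best validation log prob; per the
# quoted setup, the parameters from the best epoch are the ones used for decoding.
lr, reductions, best = 0.05, 0, float("-inf")
for epoch in range(50):
    val_logprob = -1.0 - 0.01 * epoch   # placeholder for a real validation pass
    lr, reductions = step_learning_rate(lr, best, val_logprob, reductions)
    best = max(best, val_logprob)
```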