Optimal Completion Distillation for Sequence Learning
Authors: Sara Sabour, William Chan, Mohammad Norouzi
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | OCD achieves the state-of-the-art performance on end-to-end speech recognition, on both Wall Street Journal and Librispeech datasets, achieving 9.3% and 4.5% word error rates, respectively. |
| Researcher Affiliation | Industry | Sara Sabour, William Chan, Mohammad Norouzi {sasabour, williamchan, mnorouzi}@google.com Google Brain |
| Pseudocode | Yes | Procedure 1 (EditDistanceQ) returns the Q-values of the tokens at each time step, based on the minimum edit distance between a reference sequence r and a hypothesis sequence h of length t. (A minimal Python sketch of this procedure follows the table.) |
| Open Source Code | No | We are in the process of releasing the code for OCD. |
| Open Datasets | Yes | We conduct our experiments on speech recognition on the Wall Street Journal (WSJ) (Paul and Baker, 1992) and Librispeech (Panayotov et al., 2015) benchmarks. |
| Dataset Splits | Yes | We use the standard configuration of si284 for training, dev93 for validation and report both test Character Error Rate (CER) and Word Error Rate (WER) on eval92. [...] For the Librispeech dataset, we train on the full training set (960h audio data) and validate our results on the dev-other set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2016)' but does not specify a version number for it, nor for any other libraries or dependencies. |
| Experiment Setup | Yes | Our encoder uses 2 layers of convolutions with 3x3 filters, stride 2x2 and 32 channels, followed by a convolutional LSTM with 1D-convolution of filter width 3, followed by 3 LSTM layers with 256 cell size. [...] train our models for 300 epochs with batch size 8 and 8 async workers. We separately tune the learning rate for our baseline and OCD models: 0.0007 for OCD vs. 0.001 for the baseline. (A hedged Keras sketch of this encoder follows the table.) |
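To make the quoted procedure concrete, below is a minimal Python sketch of the edit-distance Q-value computation that Procedure 1 describes. The function name, the numpy usage, and the `vocab` argument are our own; end-of-sequence handling is noted in a comment but omitted for brevity. The Q-value of token `a` after a given hypothesis prefix is the negative of the best total edit distance still achievable once `a` is appended.

```python
import numpy as np

def edit_distance_q_values(ref, hyp, vocab):
    """Q-values of every token in `vocab` for each prefix of `hyp`.

    Q(prefix, a) = -(minimum edit distance to `ref` achievable by any
    completion of the prefix that starts with token a).
    """
    t, n = len(hyp), len(ref)
    # Standard Levenshtein table: d[i, j] is the edit distance between
    # the first i tokens of hyp and the first j tokens of ref.
    d = np.zeros((t + 1, n + 1), dtype=np.int64)
    d[:, 0] = np.arange(t + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, t + 1):
        for j in range(1, n + 1):
            d[i, j] = min(
                d[i - 1, j] + 1,                               # insertion
                d[i, j - 1] + 1,                               # deletion
                d[i - 1, j - 1] + (hyp[i - 1] != ref[j - 1]),  # substitution
            )
    q = np.empty((t + 1, len(vocab)), dtype=np.int64)
    for i in range(t + 1):
        m = int(d[i].min())
        # Extending with ref[j] from any column j that attains the row
        # minimum keeps the optimal completion cost at m; any other token
        # costs exactly one extra edit. (An end-of-sequence token would be
        # optimal when d[i, n] == m; omitted here.)
        optimal = {ref[j] for j in range(n) if d[i, j] == m}
        for k, a in enumerate(vocab):
            q[i, k] = -m if a in optimal else -(m + 1)
    return q

# Example pairing used in the paper's running illustration:
q = edit_distance_q_values(list("SUNDAY"), list("SATURDAY"),
                           sorted(set("SUNDAY") | set("SATURDAY")))
```

As the temperature of the exponentiated Q-values goes to zero, the OCD training target at each step reduces to a uniform distribution over the argmax tokens of the corresponding row of `q`.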
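Similarly, a hedged tf.keras sketch of the quoted encoder stack is given below. Only the layer types, filter sizes, strides, channel counts, and LSTM cell sizes come from the quote; the input feature shape, the ConvLSTM channel count, the padding, and the activations are assumptions (and `ConvLSTM1D` requires a recent TensorFlow release).

```python
import tensorflow as tf

def build_encoder(freq_bins=80):
    """Sketch of the encoder from the quoted setup; see caveats above."""
    # Spectrogram-like input: (time, frequency, 1) with a dynamic time axis.
    inp = tf.keras.Input(shape=(None, freq_bins, 1))
    x = inp
    # 2 convolution layers: 3x3 filters, 2x2 stride, 32 channels.
    for _ in range(2):
        x = tf.keras.layers.Conv2D(32, 3, strides=2, padding="same",
                                   activation="relu")(x)
    # Convolutional LSTM over time with a width-3 1D convolution along the
    # frequency axis (the channel count of 32 is an assumption).
    x = tf.keras.layers.ConvLSTM1D(32, 3, padding="same",
                                   return_sequences=True)(x)
    # Collapse (frequency, channels) per frame, then 3 LSTM layers with
    # 256 cells each.
    x = tf.keras.layers.TimeDistributed(tf.keras.layers.Flatten())(x)
    for _ in range(3):
        x = tf.keras.layers.LSTM(256, return_sequences=True)(x)
    return tf.keras.Model(inp, x)
```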