Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling
Authors: Hairong Liu, Zhenyao Zhu, Xiangang Li, Sanjeev Satheesh
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that the proposed Gram-CTC improves CTC in terms of both performance and efficiency on the large vocabulary speech recognition task at multiple scales of data, and that with Gram-CTC we can outperform the state-of-the-art on a standard speech benchmark. |
| Researcher Affiliation | Industry | 1Baidu Silicon Valley AI Lab, 1195 Bordeaux Dr, Sunnyvale, CA 94089, USA. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | Wall Street Journal (WSJ). This corpus consists primarily of read speech with texts drawn from a machine-readable corpus of Wall Street Journal news text, and contains about 80 hours of speech data. We used the standard configuration of the train si284 dataset for training, dev93 for validation, and eval92 for testing. Fisher-Switchboard. This is a commonly used English conversational telephone speech (CTS) corpus, which contains 2,300 hours of CTS data. |
| Dataset Splits | Yes | We used the standard configuration of the train si284 dataset for training, dev93 for validation, and eval92 for testing (see the configuration sketch after this table). |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers needed to replicate the experiment. |
| Experiment Setup | Yes | The network inputs are spectral magnitude maps spanning 0–8 kHz, with 161 features per 10 ms frame. At each epoch, 40% of the utterances are randomly selected to have background noise added. The optimization method is stochastic gradient descent with Nesterov momentum; typical values are a learning rate of 10^-3 and a momentum of 0.99 (see the sketches after this table). |
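
The quoted featurization pins down the frame geometry: 161 spectral magnitudes per 10 ms frame covering 0–8 kHz implies a 16 kHz sampling rate and a 320-point FFT, since 320 // 2 + 1 = 161 bins. Below is a minimal Python sketch of such a featurization; the 20 ms window length and the Hann window are assumptions not stated in the quoted excerpt, and `magnitude_spectrogram` is a hypothetical helper name, not the paper's pipeline.

```python
import numpy as np

def magnitude_spectrogram(waveform, sample_rate=16000,
                          window_ms=20, hop_ms=10):
    """Magnitude spectrogram: 161 bins per 10 ms frame, 0-8 kHz.

    A 20 ms window at 16 kHz is 320 samples, and a 320-point real FFT
    yields 320 // 2 + 1 = 161 frequency bins spanning 0 to 8 kHz.
    """
    win = int(sample_rate * window_ms / 1000)   # 320 samples
    hop = int(sample_rate * hop_ms / 1000)      # 160 samples
    assert len(waveform) >= win, "need at least one full window"
    window = np.hanning(win)
    n_frames = 1 + (len(waveform) - win) // hop
    frames = np.stack([waveform[i * hop:i * hop + win] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape (n_frames, 161)

# Example: 1 second of audio -> 99 frames x 161 features.
features = magnitude_spectrogram(np.random.randn(16000))
print(features.shape)  # (99, 161)
```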
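
The quoted dataset splits and optimizer settings likewise translate directly into configuration. The PyTorch sketch below wires up SGD with Nesterov momentum at the reported values and the per-epoch 40% noise-augmentation selection. The split names come from the paper, but `select_for_noise`, the seed handling, and the stand-in model are hypothetical, since the paper does not describe its training code.

```python
import random
import torch

# Standard WSJ configuration quoted in the table above.
WSJ_SPLITS = {"train": "train_si284", "valid": "dev93", "test": "eval92"}

def select_for_noise(utterance_ids, fraction=0.4, seed=None):
    """Randomly pick the fraction of utterances (40% per the paper)
    that will have background noise added this epoch."""
    rng = random.Random(seed)
    k = int(len(utterance_ids) * fraction)
    return set(rng.sample(utterance_ids, k))

# Stand-in model: the acoustic model architecture is not part of the
# quoted setup, so a single linear layer stands in here purely to make
# the optimizer call concrete (161 spectral input features).
model = torch.nn.Linear(161, 29)

# SGD with Nesterov momentum at the "typical values" quoted above.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.99, nesterov=True)

# Per-epoch usage: re-draw the 40% noise subset each epoch.
utterances = [f"utt_{i:04d}" for i in range(1000)]
noisy_this_epoch = select_for_noise(utterances, seed=0)
print(len(noisy_this_epoch))  # 400
```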