Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices

Authors: Jinhwan Park, Yoonho Boo, Iksoo Choi, Sungho Shin, Wonyong Sung

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present real-time speech recognition on smartphones or embedded systems by employing recurrent neural network (RNN) based acoustic models, RNN based language models, and beam-search decoding. The experimental results including the execution time analysis are shown in Section 4. Table 1 shows the CER and WER performance of the RNN models trained with the WSJ SI-284 training set.
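The excerpt above describes combining per-frame acoustic scores with an RNN LM inside a beam-search decoder. The sketch below is a minimal, simplified illustration of that idea, not the authors' implementation: `am_scores` and `lm_score` are hypothetical placeholders, the beam width and LM weight are arbitrary, and CTC-specific blank/prefix handling is omitted.

```python
import math
from typing import Callable, List, Tuple

def beam_search(
    am_scores: List[dict],                   # per-frame {char: log-prob} from the acoustic model
    lm_score: Callable[[str, str], float],   # hypothetical LM: log P(next_char | prefix)
    beam_width: int = 8,
    lm_weight: float = 0.5,
) -> str:
    """Beam search combining acoustic and LM log-probabilities.

    Simplified sketch: no CTC blank handling, no length normalization.
    """
    beams: List[Tuple[str, float]] = [("", 0.0)]   # (prefix, total log-score)
    for frame in am_scores:
        candidates = []
        for prefix, score in beams:
            for ch, am_lp in frame.items():
                total = score + am_lp + lm_weight * lm_score(prefix, ch)
                candidates.append((prefix + ch, total))
        # Keep only the best `beam_width` hypotheses.
        candidates.sort(key=lambda x: x[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]

# Toy usage with a uniform "LM" and two frames of fake acoustic scores.
if __name__ == "__main__":
    frames = [{"a": math.log(0.7), "b": math.log(0.3)},
              {"a": math.log(0.4), "b": math.log(0.6)}]
    print(beam_search(frames, lm_score=lambda prefix, ch: 0.0))
```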
Researcher Affiliation | Academia | Jinhwan Park, Seoul National University, bnoo@snu.ac.kr; Yoonho Boo, Seoul National University, dnsgh@snu.ac.kr; Iksoo Choi, Seoul National University, akacis@snu.ac.kr; Sungho Shin, Seoul National University, ssh9919@snu.ac.kr; Wonyong Sung, Seoul National University, wysung@snu.ac.kr
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It provides equations and describes algorithms in text, for example in Appendix C, which is titled "Details of Decoding Algorithm" but does not show pseudocode.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. There is no repository link or explicit code release statement.
Open Datasets | Yes | We used Wall Street Journal (WSJ) SI-284 training set (81 hours) for the fast evaluation of AMs. We also trained our system using a larger dataset, Librispeech Corpus [33]. [33] Vassil Panayotov, Guoguo Chen, Daniel Povey, and Sanjeev Khudanpur. Librispeech: An ASR corpus based on public domain audio books. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5206–5210. IEEE, 2015.
Dataset Splits | Yes | We randomly selected 5% of WSJ LM training text to the valid set, and another 5% to the test set. The remaining 90% of the text is used for training RNN LM.
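A minimal sketch of how such a 5%/5%/90% random text split could be reproduced. The proportions follow the quoted text; the shuffling procedure, seed, and function name below are assumptions, since the paper does not specify them.

```python
import random

def split_lm_text(lines, valid_frac=0.05, test_frac=0.05, seed=0):
    """Randomly split LM training text into train/valid/test portions.

    The 5%/5%/90% proportions follow the paper; the shuffle and seed
    are assumptions, as the paper does not describe them.
    """
    lines = list(lines)
    random.Random(seed).shuffle(lines)
    n_valid = int(len(lines) * valid_frac)
    n_test = int(len(lines) * test_frac)
    valid = lines[:n_valid]
    test = lines[n_valid:n_valid + n_test]
    train = lines[n_valid + n_test:]
    return train, valid, test
```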
Hardware Specification | Yes | The implementation operates in real-time on the ARM Cortex-A57 based embedded system without GPU support. The ARM CPU has 80 KB L1 data cache and 2,048 KB L2 cache.
Software Dependencies | No | Open BLAS library [34] is used for the optimization of computation. For 8-bit implementation, gemmlowp library [35] is employed. The paper names the OpenBLAS and gemmlowp libraries but does not provide version numbers for either.
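For context on the 8-bit path, the NumPy sketch below illustrates the quantize → integer GEMM → dequantize flow that an 8-bit GEMM library provides conceptually. gemmlowp itself is a C++ library with a different interface (unsigned operands and zero-point offsets), so this is only a conceptual stand-in, not the library's API.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization of a float matrix to int8."""
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Multiply two float matrices via int8 operands with 32-bit accumulation,
    then dequantize. Mimics the idea behind an 8-bit GEMM, not gemmlowp's API."""
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    acc = qa.astype(np.int32) @ qb.astype(np.int32)   # integer accumulator
    return acc.astype(np.float32) * (sa * sb)

if __name__ == "__main__":
    a = np.random.randn(4, 8).astype(np.float32)
    b = np.random.randn(8, 3).astype(np.float32)
    print(np.max(np.abs(int8_matmul(a, b) - a @ b)))  # small quantization error
```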
Experiment Setup | Yes | The width of 1-D convolution is set to 15, which seems to be the optimum number at our experiments. We applied batch normalization [28] to the first two convolutional layers and variational dropout [29] to every output of the recurrent layer for regularization. Adam optimizer [30] was applied for training. We used an initial learning rate of 3e-4, and the learning rate was reduced to half if the validation error was not lowered for consecutive 8 epochs. Gradient clipping with a maximum norm of 4.0 was applied.
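The optimizer, learning-rate schedule, and gradient clipping quoted above map onto standard framework calls. The following PyTorch sketch is a hedged illustration of those three settings only: the paper does not state which framework was used, `model` is a placeholder rather than the authors' conv/recurrent network, and the batch-norm and variational-dropout details are not shown here.

```python
import torch

# Placeholder model; the paper's actual acoustic model combines
# convolutional layers (with batch norm) and recurrent layers.
model = torch.nn.GRU(input_size=40, hidden_size=256, num_layers=2)

# Adam with an initial learning rate of 3e-4, as in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Halve the learning rate when the validation error has not improved
# for 8 consecutive epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=8)

def training_step(batch, targets, loss_fn):
    """One optimization step with gradient clipping at max norm 4.0."""
    optimizer.zero_grad()
    outputs, _ = model(batch)
    loss = loss_fn(outputs, targets)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=4.0)
    optimizer.step()
    return loss.item()

# After each epoch, step the scheduler on the validation metric:
# scheduler.step(validation_error)
```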