Acquiring Speech Transcriptions Using Mismatched Crowdsourcing

Authors: Preethi Jyothi, Mark Hasegawa-Johnson

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the feasibility of our technique using an isolated word recovery task for Hindi: we predict transcriptions for isolated words in Hindi using mismatched transcriptions from crowd workers unfamiliar with Hindi. We successfully recover more than 85% of the words (and more than 94% in a 4-best list)." (Section 4, Experimental Setup)
Researcher Affiliation | Academia | "Preethi Jyothi and Mark Hasegawa-Johnson, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, 405 N. Mathews, Urbana, Illinois 61801"
Pseudocode | No | The paper describes its algorithms in prose but does not include a clearly labeled pseudocode block or algorithm section.
Open Source Code | No | The paper mentions and links to third-party tools (the Carmel and OpenFst toolkits) but does not state that the code for its own methodology is open source or provide a link to it.
Open Datasets | Yes | "We extracted Hindi speech from Special Broadcasting Service (SBS, Australia) radio podcasts consisting of mostly spontaneous, semi-formal speech. ... We created a vocabulary comprising all the words in our data, along with the 1000 most frequent words from Hindi monolingual text in the EMILLE corpus (Baker et al. 2002)."
Dataset Splits | No | The paper mentions a 'training set' and an 'evaluation set' that did not overlap, but does not give specific percentages or counts for a train/validation/test split.
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., CPU/GPU models, memory).
Software Dependencies | No | The paper mentions the 'USC/ISI Carmel finite-state toolkit' and the 'OpenFst toolkit' but does not give version numbers for these or any other software dependencies.
Experiment Setup | Yes | "As the scaling function F, we use the square root function, i.e., F(α) = √α. The weights on the arcs of the FST model are negative log probabilities; these are learned using EM to maximize the likelihood of the observed data."
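The quoted setup pairs two facts: arc probabilities are scaled by F(α) = √α, and arc weights are stored as negative log probabilities. A minimal sketch (not from the paper; function names are ours) shows that these compose neatly: taking the square root of a probability is equivalent to halving its negative-log weight, since -log(√α) = ½·(-log α).

```python
import math

def sqrt_scale_prob(alpha):
    """Scaling function F(alpha) = sqrt(alpha) applied to a probability."""
    return math.sqrt(alpha)

def sqrt_scale_neglog(weight):
    """The same scaling applied in negative-log space:
    -log(sqrt(alpha)) = 0.5 * (-log(alpha))."""
    return 0.5 * weight

# Example: an arc with probability 0.25 (negative-log weight -log 0.25).
alpha = 0.25
weight = -math.log(alpha)

scaled_prob = sqrt_scale_prob(alpha)          # 0.5
scaled_weight = sqrt_scale_neglog(weight)     # -log 0.5

# Both routes agree: scaling the probability then taking -log gives
# the same value as halving the negative-log weight directly.
assert math.isclose(-math.log(scaled_prob), scaled_weight)
```

This equivalence is why such a scaling is cheap to apply inside a weighted FST: it is a constant multiplier on the stored weights rather than a transformation of the probabilities themselves.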