A Streaming End-to-End Framework For Spoken Language Understanding
Authors: Nihal Potdar, Anderson Raymundo Avila, Chao Xing, Dong Wang, Yiran Cao, Xiao Chen
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our solution on the Fluent Speech Commands (FSC) dataset and the intent detection accuracy is about 97 % on all multi-intent settings. |
| Researcher Affiliation | Collaboration | Nihal Potdar¹, Anderson R. Avila², Chao Xing², Dong Wang³, Yiran Cao¹, Xiao Chen² — ¹University of Waterloo, ²Huawei Noah's Ark Lab, ³Tsinghua University |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability for the described methodology. |
| Open Datasets | Yes | The Fluent Speech Commands (FSC) dataset [Lugosch et al., 2019] was used to train and evaluate our SLU model for intent classification. ... We also used the Google Speech Commands (GSC) dataset [Warden, 2018]. |
| Dataset Splits | Yes | The whole dataset was split to three subsets: the training set (FSC-Tr) contained 14.7 hours of data, totalling 23,132 utterances from 77 speakers; the validation set (FSC-Val) and test set (FSC-Tst) comprised 1.9 and 2.4 hours of speech, leading to 3,118 utterances from 10 speakers and 3,793 utterances from other 10 speakers, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or types) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'The Kaldi toolkit is used' and 'ADAM optimizer', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The model was trained using the ADAM optimizer [Loshchilov and Hutter, 2017], with the initial learning rate set to 0.0001. Dropout probability was set to 0.1 and the parameter for weight decay was set to 0.2. For the ASR pre-training, the ASR model was trained 100 epochs; for the CE pre-training, the model was trained for 10 epochs with the CE criterion. |
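
The experiment-setup row above can be read as a concrete training configuration. The following is a minimal sketch, not the authors' code, assuming a PyTorch implementation; the model architecture, intent count, and data loader names are hypothetical placeholders, while the optimizer choice, learning rate, dropout, weight decay, and epoch counts follow the quoted text.

```python
# Sketch of the reported training configuration (hypothetical model and loaders).
import torch
import torch.nn as nn


class StreamingSLUModel(nn.Module):
    """Hypothetical stand-in for the paper's streaming SLU encoder/classifier."""

    def __init__(self, n_mels: int = 80, hidden: int = 256,
                 n_intents: int = 31, dropout: float = 0.1):
        super().__init__()
        self.encoder = nn.GRU(n_mels, hidden, num_layers=2,
                              batch_first=True, dropout=dropout)
        self.classifier = nn.Linear(hidden, n_intents)

    def forward(self, feats):                      # feats: (batch, time, n_mels)
        out, _ = self.encoder(feats)
        return self.classifier(out[:, -1, :])      # intent logits from the last frame


model = StreamingSLUModel(dropout=0.1)             # dropout probability 0.1, as reported

# ADAM optimizer with initial learning rate 1e-4 and weight decay 0.2, as reported.
# (The paper cites Loshchilov & Hutter, 2017, so decoupled weight decay / AdamW
# may be intended; plain Adam is used here as an assumption.)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=0.2)
criterion = nn.CrossEntropyLoss()

# Two-stage schedule from the paper: 100 epochs of ASR pre-training, then
# 10 epochs with the cross-entropy (CE) criterion. Loaders and the ASR loss
# are assumed and omitted here.
# for epoch in range(100):  # ASR pre-training
#     ...
# for epoch in range(10):   # CE pre-training for intent classification
#     for feats, intents in train_loader:
#         optimizer.zero_grad()
#         loss = criterion(model(feats), intents)
#         loss.backward()
#         optimizer.step()
```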