Adapting Translation Models for Transcript Disfluency Detection

Authors: Qianqian Dong, Feng Wang, Zhen Yang, Wei Chen, Shuang Xu, Bo Xu

AAAI 2019, pp. 6351-6358 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on the publicly available Switchboard set and an in-house Chinese set. Experimental results show that the proposed model significantly outperforms previous state-of-the-art models.
Researcher Affiliation | Collaboration | Qianqian Dong (1,2), Feng Wang (1), Zhen Yang (1,2), Wei Chen (1), Shuang Xu (1), Bo Xu (1,2). 1: Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2: University of Chinese Academy of Sciences, Beijing, China.
Pseudocode | No | Not found. The paper describes processes and architectures but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our models are implemented with TensorFlow. Code: https://github.com/dqqcasia/TranslationDisfluencyDetection
Open Datasets | Yes | To directly compare with previous state-of-the-art results in the field of TDD, we limit our training data strictly to public resources. Our training data includes the Switchboard disfluency-annotated corpus (the Switchboard portion of the English Penn Treebank) and an in-house Chinese dataset... Following the experiment settings in (Charniak and Johnson 2001; Honnibal and Johnson 2014; Wu et al. 2015)... The details of our Chinese TDD dataset and our annotation rules are available online: https://github.com/dqqcasia/TranslationDisfluencyDetection/tree/master/data/chinese_disfluency
Dataset Splits | Yes | Following the experiment settings in (Charniak and Johnson 2001; Honnibal and Johnson 2014; Wu et al. 2015), we use directories 2 and 3 in PARSED/MRG/SWBD as our training set and split directory 4 into the test set, development set, and others. The development data consists of all sw4[5-9]*.dps files... Table 1: The statistics on the training, development, and test sets in Switchboard. (A directory-split sketch follows the table.)
Hardware Specification | Yes | Our models are trained for a maximum of 200,000 steps on 2 NVIDIA Titan-X GPUs.
Software Dependencies | No | Not found. The paper mentions TensorFlow but does not specify its version or the versions of any other software libraries.
Experiment Setup | Yes | We use the hyperparameter settings of the base Transformer model described in Vaswani et al. (2017) for the encoder and decoder stacks. We share encoder and decoder word embeddings during training and inference... We use a shared word-level vocabulary of 20,000; for the Chinese corpus, we use a shared character-level vocabulary of 3,000... Sentence pairs are batched together by approximate sequence length; each batch contains a set of sentence pairs with approximately 7,000 source and target tokens. Our models are trained for a maximum of 200,000 steps on 2 NVIDIA Titan-X GPUs. (A configuration sketch follows the table.)
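
The Dataset Splits row describes the standard Switchboard partition (training from directories 2 and 3, development from the sw4[5-9]*.dps files in directory 4). The minimal Python sketch below illustrates that partition; the SWBD_ROOT path and the assumption that the .dps files sit directly under numbered subdirectories are hypothetical and should be adjusted to the local Treebank layout. The test-file pattern is not restated in the quoted passage, so the remaining directory-4 files are only collected, not labeled.

```python
# Sketch of the Switchboard split quoted in the Dataset Splits row.
# SWBD_ROOT is a hypothetical local path, not taken from the paper or its repository.
import glob
import os

SWBD_ROOT = "PARSED/MRG/SWBD"

# Directories 2 and 3 form the training set.
train_files = sorted(
    glob.glob(os.path.join(SWBD_ROOT, "2", "*.dps"))
    + glob.glob(os.path.join(SWBD_ROOT, "3", "*.dps"))
)

# Directory 4 is split: sw4[5-9]*.dps files form the development set;
# the quoted passage does not restate the test-file pattern, so the rest
# of directory 4 is left unlabeled here.
dir4_files = sorted(glob.glob(os.path.join(SWBD_ROOT, "4", "*.dps")))
dev_files = [
    f for f in dir4_files
    if os.path.basename(f).startswith(("sw45", "sw46", "sw47", "sw48", "sw49"))
]
remaining_dir4 = [f for f in dir4_files if f not in dev_files]

print(len(train_files), len(dev_files), len(remaining_dir4))
```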
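The Experiment Setup row lists settings scattered across the quoted text. The configuration sketch below collects them in one place; the key names are illustrative (not taken from the authors' repository), the stack dimensions are the standard base Transformer values from Vaswani et al. (2017), and the remaining values are those quoted above.

```python
# Hedged configuration sketch for the quoted experiment setup.
# Key names are illustrative; values follow the base Transformer of
# Vaswani et al. (2017) plus the settings quoted in the table above.
config = {
    # Base Transformer (Vaswani et al. 2017) encoder/decoder stacks
    "num_layers": 6,
    "hidden_size": 512,
    "filter_size": 2048,
    "num_heads": 8,

    # Settings stated in the paper
    "share_embeddings": True,      # encoder/decoder word embeddings are shared
    "vocab_size_switchboard": 20000,  # shared word-level vocabulary
    "vocab_size_chinese": 3000,       # shared character-level vocabulary
    "batch_tokens": 7000,          # approx. source + target tokens per batch,
                                   # batched by similar sequence length
    "max_train_steps": 200000,
    "num_gpus": 2,                 # NVIDIA Titan-X
}
```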