End-to-End Transition-Based Online Dialogue Disentanglement

Authors: Hui Liu, Zhan Shi, Jia-Chen Gu, Quan Liu, Si Wei, Xiaodan Zhu

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our model on both the dataset we developed and the publicly available Ubuntu IRC dataset [Kummerfeld et al., 2019]. The results show that our model significantly outperforms the existing algorithms.
Researcher Affiliation | Collaboration | Ingenuity Labs Research Institute & ECE, Queen's University, Canada; University of Science and Technology of China, Hefei, China; State Key Laboratory of Cognitive Intelligence, iFLYTEK Research, Hefei, China
Pseudocode | No | The paper describes the model architecture and components in Section 4, but it does not include a formal pseudocode block or algorithm listing.
Open Source Code | Yes | https://github.com/layneins/e2e-dialo-disentanglement
Open Datasets | Yes | To contribute to the research on disentanglement, we develop a large-scale dataset from online movie scripts. ... We publish our dataset to the research community. ... and the publicly available Ubuntu IRC dataset [Kummerfeld et al., 2019].
Dataset Splits | Yes | We randomly split the dataset into 29,669/2036/2010 pairs for train/dev/test. ... We separate roughly every 50 continuous messages into a group and obtain 1737/134/104 pairs for train/dev/test, respectively. (A chunking sketch follows the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware used for the experiments, such as CPU or GPU models or memory specifications.
Software Dependencies | No | The paper mentions using "GloVe vectors [Pennington et al., 2014]" for word embeddings and the "Adam optimizer [Kingma and Ba, 2014]", but it does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | We initialize word embeddings using 300-dimension GloVe vectors [Pennington et al., 2014]. Other parameters are initialized by sampling from a normal distribution with a standard deviation of 0.1. The mini-batch size is 16 and the size of hidden vectors in the LSTM is 300. We use the Adam optimizer [Kingma and Ba, 2014] with an initial learning rate of 5e-4.
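The Ubuntu IRC preprocessing quoted in the Dataset Splits row (separating roughly every 50 continuous messages into a group) amounts to simple chunking of a channel log. Below is a minimal, hypothetical sketch of that step; the function name group_messages and the example sizes are illustrative and not taken from the authors' code.

```python
from typing import List


def group_messages(messages: List[str], group_size: int = 50) -> List[List[str]]:
    """Separate roughly every `group_size` continuous messages into one group."""
    return [messages[i:i + group_size] for i in range(0, len(messages), group_size)]


# Example: a log of 134 messages yields groups of 50, 50, and 34 messages.
groups = group_messages([f"msg {i}" for i in range(134)])
print([len(g) for g in groups])  # [50, 50, 34]
```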
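The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. The sketch below assumes PyTorch (the report does not name the framework), and names such as UtteranceEncoder, glove_weights, and the vocabulary size of 50,000 are illustrative placeholders rather than the authors' released code; the actual architecture is described in Section 4 of the paper and in the repository linked above.

```python
import torch
import torch.nn as nn

EMBED_DIM = 300    # 300-dimension GloVe word embeddings
HIDDEN_DIM = 300   # size of hidden vectors in the LSTM
BATCH_SIZE = 16    # mini-batch size
LR = 5e-4          # initial learning rate for Adam
INIT_STD = 0.1     # std of the normal distribution used for other parameters


class UtteranceEncoder(nn.Module):
    """Illustrative encoder reflecting the reported hyperparameters."""

    def __init__(self, vocab_size: int, glove_weights: torch.Tensor = None):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, EMBED_DIM)
        if glove_weights is not None:
            # Initialize word embeddings from pretrained 300-d GloVe vectors.
            self.embedding.weight.data.copy_(glove_weights)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        # Other parameters are sampled from a normal distribution with std 0.1.
        for _, param in self.lstm.named_parameters():
            nn.init.normal_(param, mean=0.0, std=INIT_STD)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> last LSTM hidden state: (batch, HIDDEN_DIM)
        embedded = self.embedding(token_ids)
        _, (h_n, _) = self.lstm(embedded)
        return h_n[-1]


model = UtteranceEncoder(vocab_size=50_000)
optimizer = torch.optim.Adam(model.parameters(), lr=LR)
```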