End-to-End Transition-Based Online Dialogue Disentanglement
Authors: Hui Liu, Zhan Shi, Jia-Chen Gu, Quan Liu, Si Wei, Xiaodan Zhu
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on both the dataset we developed and the publicly available Ubuntu IRC dataset [Kummerfeld et al., 2019]. The results show that our model significantly outperforms the existing algorithms. |
| Researcher Affiliation | Collaboration | (1) Ingenuity Labs Research Institute & ECE, Queen's University, Canada; (2) University of Science and Technology of China, Hefei, China; (3) State Key Laboratory of Cognitive Intelligence, iFLYTEK Research, Hefei, China |
| Pseudocode | No | The paper describes the model architecture and components in Section 4, but it does not include a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | https://github.com/layneins/e2e-dialo-disentanglement |
| Open Datasets | Yes | To contribute to the research on disentanglement, we develop a large-scale dataset from online movie scripts. ... We publish our dataset to the research community. ... and the publicly available Ubuntu IRC dataset [Kummerfeld et al., 2019]. |
| Dataset Splits | Yes (see the grouping sketch after the table) | We randomly split the dataset into 29,669/2036/2010 pairs for train/dev/test. ... We separate roughly every 50 continuous messages into a group and obtain 1737/134/104 pairs for train/dev/test, respectively. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions using "GloVe vectors [Pennington et al., 2014]" for word embedding and the "Adam optimizer [Kingma and Ba, 2014]" but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes (see the hedged training-setup sketch after the table) | We initialize word embedding using 300-dimension GloVe vectors [Pennington et al., 2014]. Other parameters are initialized by sampling from a normal distribution with a standard deviation of 0.1. The mini-batch size is 16 and the size of hidden vectors in the LSTM is 300. We use the Adam optimizer [Kingma and Ba, 2014] with an initial learning rate of 5e-4. |
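
The dataset-splits row above can be illustrated with a short grouping-and-splitting routine. The sketch below is a minimal assumption-laden illustration, not the authors' released preprocessing code: the `group_size=50` value follows the paper's "roughly every 50 continuous messages", while the function names, split ratios, and random seed are hypothetical choices that only approximate the reported 1737/134/104 group counts.

```python
import random

def make_groups(messages, group_size=50):
    """Chunk a message stream into contiguous groups of roughly group_size.

    Mirrors the paper's description of separating "roughly every 50
    continuous messages into a group"; exact boundary handling is assumed.
    """
    return [messages[i:i + group_size] for i in range(0, len(messages), group_size)]

def split_groups(groups, ratios=(0.88, 0.07, 0.05), seed=0):
    """Shuffle groups and split into train/dev/test by the given ratios.

    The ratios and seed are assumptions; they roughly reproduce the
    1737/134/104 proportions reported for the Ubuntu IRC data.
    """
    rng = random.Random(seed)
    shuffled = list(groups)
    rng.shuffle(shuffled)
    n_train = int(ratios[0] * len(shuffled))
    n_dev = int(ratios[1] * len(shuffled))
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test
```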
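
The quoted experiment setup also translates directly into code. The PyTorch sketch below wires the reported hyperparameters (300-dimension GloVe embeddings, normal initialization with standard deviation 0.1, LSTM hidden size 300, mini-batch size 16, Adam with an initial learning rate of 5e-4) into a toy encoder; the `UtteranceEncoder` class, the vocabulary size, and the single-layer architecture are illustrative assumptions, not the paper's actual model.

```python
import torch
import torch.nn as nn

EMBED_DIM = 300   # 300-dimension GloVe vectors (reported in the paper)
HIDDEN_DIM = 300  # LSTM hidden vector size (reported in the paper)
BATCH_SIZE = 16   # mini-batch size (reported in the paper)
LR = 5e-4         # initial Adam learning rate (reported in the paper)
INIT_STD = 0.1    # std for normal initialization of other parameters (reported)

class UtteranceEncoder(nn.Module):
    """Hypothetical stand-in for the paper's model; only the hyperparameter
    values above are taken from the paper itself."""

    def __init__(self, vocab_size, glove_weights=None):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, EMBED_DIM)
        if glove_weights is not None:
            # Initialize word embeddings from pretrained GloVe vectors.
            self.embedding.weight.data.copy_(glove_weights)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        # Initialize the non-embedding parameters from N(0, 0.1), as reported.
        for _, param in self.lstm.named_parameters():
            nn.init.normal_(param, mean=0.0, std=INIT_STD)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)          # (batch, seq, EMBED_DIM)
        _, (h_n, _) = self.lstm(embedded)             # h_n: (1, batch, HIDDEN_DIM)
        return h_n[-1]                                # utterance representation

model = UtteranceEncoder(vocab_size=50_000)
optimizer = torch.optim.Adam(model.parameters(), lr=LR)
```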