End-to-End Deep Reinforcement Learning for Conversation Disentanglement
Authors: Karan Bhukar, Harshit Kumar, Dinesh Raghu, Ajay Gupta
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments on the Ubuntu IRC dataset, we demonstrate that the proposed RL model improves the performance on both link-level and conversation-level metrics. We evaluate the proposed RL based approach on the widely used Ubuntu IRC dataset (Kummerfeld et al. 2018). We find that our RL based approach that uses our novel TL-FBC metric as reward is significantly better than baselines on both link-level and conversation-level metrics. |
| Researcher Affiliation | Industry | Karan Bhukar¹, Harshit Kumar¹, Dinesh Raghu¹, Ajay Gupta² (¹IBM Research, ²Meta); karan.bhukar1@ibm.com, harshitk@in.ibm.com, diraghu1@in.ibm.com, guptaajay@fb.com |
| Pseudocode | No | The paper describes its methodology using text and mathematical equations but does not contain a structured pseudocode or algorithm block with a clear label such as 'Algorithm' or 'Pseudocode'. |
| Open Source Code | Yes | https://github.com/karan121bhukar/RL-ConvDisentanglement |
| Open Datasets | Yes | We use Ubuntu IRC (Internet Relay Chat) (Kummerfeld et al. 2018), the most widely used conversation disentanglement dataset, for our experiments. |
| Dataset Splits | Yes | Table 1 (Statistics of Ubuntu IRC dataset): Train 220,463 messages; Dev 12,500 messages; Test 15,000 messages |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments, only mentioning the use of PyTorch. |
| Software Dependencies | No | The paper mentions using 'PyTorch (Paszke et al. 2019)' but does not provide specific version numbers for PyTorch or other software dependencies. |
| Experiment Setup | Yes | The set of hyper-parameters that gives the best results has the learning rate and input sequence length set to 5e-6 and 128, respectively. We set the number of trajectories N and the candidate parents window size w to 10 and 50, respectively. |
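For reference, the reported hyper-parameters can be collected into a single configuration sketch. The dictionary name and key names below are illustrative choices, not identifiers from the paper's released code; only the numeric values come from the paper.

```python
# Hyper-parameters reported in the paper (key names are illustrative).
config = {
    "learning_rate": 5e-6,    # best-performing learning rate
    "max_seq_length": 128,    # input sequence length
    "num_trajectories": 10,   # N: number of sampled trajectories
    "parent_window": 50,      # w: candidate parents window size
}

for name, value in config.items():
    print(f"{name} = {value}")
```

Hardware and software versions are not reported, so reproducing these results may still require sweeping nearby values.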