reproducibilityindex.ai

End-to-End Deep Reinforcement Learning for Conversation Disentanglement

Authors: Karan Bhukar, Harshit Kumar, Dinesh Raghu, Ajay Gupta

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through experiments on the Ubuntu IRC dataset, we demonstrate that the proposed RL model improves the performance on both link-level and conversation-level metrics. We evaluate the proposed RL based approach on the widely used Ubuntu IRC dataset (Kummerfeld et al. 2018). We find that our RL based approach that uses our novel TL-FBC metric as reward is significantly better than baselines on both link-level and conversation-level metrics.
Researcher Affiliation	Industry	Karan Bhukar (IBM Research), Harshit Kumar (IBM Research), Dinesh Raghu (IBM Research), Ajay Gupta* (Meta). karan.bhukar1@ibm.com, harshitk@in.ibm.com, diraghu1@in.ibm.com, guptaajay@fb.com
Pseudocode	No	The paper describes its methodology using text and mathematical equations but does not contain a structured pseudocode or algorithm block with a clear label such as 'Algorithm' or 'Pseudocode'.
Open Source Code	Yes	https://github.com/karan121bhukar/RL-ConvDisentanglement
Open Datasets	Yes	We use Ubuntu IRC (Internet Relay Chat) (Kummerfeld et al. 2018), the most widely used conversation disentanglement dataset, for our experiments.
Dataset Splits	Yes	Table 1: Statistics of Ubuntu IRC dataset. Train: 220,463 messages; Dev: 12,500 messages; Test: 15,000 messages.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments, only mentioning the use of PyTorch.
Software Dependencies	No	The paper mentions using 'PyTorch (Paszke et al. 2019)' but does not provide specific version numbers for PyTorch or other software dependencies.
Experiment Setup	Yes	The hyper-parameters that give the best results are a learning rate of 5e-6 and an input sequence length of 128. We set the number of trajectories N and the candidate parents window size w to 10 and 50, respectively.
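
The reported setup can be collected into a small configuration object for anyone attempting a rerun. The sketch below is illustrative only: the class name, field names, and the `candidate_parents` helper are hypothetical and are not taken from the paper or its repository. In particular, treating the window of w candidates as the message itself plus the messages immediately before it is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    # Values reported in the Experiment Setup row; field names are hypothetical.
    learning_rate: float = 5e-6
    input_seq_len: int = 128
    num_trajectories: int = 10   # N
    parent_window: int = 50      # w


def candidate_parents(msg_index: int, window: int) -> list[int]:
    """Hypothetical helper: indices considered as parents of one message.

    Assumption (not stated in the summary): the window covers the message
    itself plus the messages immediately preceding it, `window` candidates
    in total, truncated at the start of the log.
    """
    start = max(0, msg_index - window + 1)
    return list(range(start, msg_index + 1))


config = ExperimentConfig()
print(config)
print(candidate_parents(3, config.parent_window))
```

Freezing the dataclass keeps the reported values from being changed by accident during a run; any other setting a rerun needs (optimizer, batch size, library versions) is not given in the summary and would have to come from the paper or the repository.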