RefNet: A Reference-Aware Network for Background Based Conversation
Authors: Chuan Meng, Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke
AAAI 2020, pp. 8496-8503
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that RefNet significantly outperforms state-of-the-art methods in terms of both automatic and human evaluations, indicating that RefNet can generate more appropriate and human-like responses. |
| Researcher Affiliation | Academia | Chuan Meng¹, Pengjie Ren², Zhumin Chen¹, Christof Monz², Jun Ma¹, Maarten de Rijke²; ¹Shandong University, Qingdao, China; ²University of Amsterdam, Amsterdam, The Netherlands |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available online: https://github.com/ChuanMeng/RefNet |
| Open Datasets | Yes | We choose the Holl-E dataset released by Moghe et al. (2018) because it contains boundary annotations of the background information used for each response. |
| Dataset Splits | Yes | We follow the original data split for training, validation and test. There are also two versions of the test set: one with a single golden reference (SR) and the other with multiple golden references (MR); see (Moghe et al. 2018). |
| Hardware Specification | No | The paper does not provide any specific hardware details like GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions optimizers (Adam) and network architectures (GRU) but does not list specific software libraries or their version numbers (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | We set the word embedding size and GRU hidden state size to 128 and 256, respectively. The vocabulary size is limited to 25,000. For fair comparison, all models use the same embedding size, hidden state size and vocabulary size. Following Moghe et al. (2018), we limit the context length of all models to 65. We train all models for 30 epochs, evaluate on the validation set after each epoch, and select the best model according to the BLEU metric. We use gradient clipping with a maximum gradient norm of 2. We use the Adam optimizer with a mini-batch size of 32. The learning rate is 0.001. (A hedged training-setup sketch based on these settings follows the table.) |
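
The experiment-setup row gives every hyperparameter needed to reconstruct the training configuration. Below is a minimal PyTorch sketch, not the authors' released code: `Seq2SeqBaseline` and `train_step` are hypothetical names, and RefNet's reference-aware decoding is omitted. It only shows how the reported sizes (embedding 128, GRU hidden 256, vocabulary 25,000, context length 65) and optimization settings (Adam at learning rate 0.001, mini-batch 32, gradient-norm clipping at 2) fit together.

```python
# Minimal sketch, assuming PyTorch; Seq2SeqBaseline and train_step are
# hypothetical names, and RefNet's reference-aware decoding is not shown.
import torch
import torch.nn as nn

VOCAB_SIZE = 25_000   # vocabulary limited to 25,000
EMBED_SIZE = 128      # word embedding size
HIDDEN_SIZE = 256     # GRU hidden state size
MAX_CONTEXT_LEN = 65  # context length limit, following Moghe et al. (2018)

class Seq2SeqBaseline(nn.Module):
    """GRU encoder-decoder skeleton using the paper's reported sizes."""
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(VOCAB_SIZE, EMBED_SIZE)
        self.encoder = nn.GRU(EMBED_SIZE, HIDDEN_SIZE, batch_first=True)
        self.decoder = nn.GRU(EMBED_SIZE, HIDDEN_SIZE, batch_first=True)
        self.out = nn.Linear(HIDDEN_SIZE, VOCAB_SIZE)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.embedding(src_ids))   # encode the context
        dec_out, _ = self.decoder(self.embedding(tgt_ids), state)
        return self.out(dec_out)                           # per-token logits

model = Seq2SeqBaseline()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # Adam, lr = 0.001
criterion = nn.CrossEntropyLoss()

def train_step(src_ids, tgt_in, tgt_out):
    """One optimization step with the reported optimizer and clipping."""
    optimizer.zero_grad()
    logits = model(src_ids, tgt_in)
    loss = criterion(logits.reshape(-1, VOCAB_SIZE), tgt_out.reshape(-1))
    loss.backward()
    # Gradient clipping with a maximum gradient norm of 2.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0)
    optimizer.step()
    return loss.item()

# Dummy mini-batch of 32, matching the reported batch size.
src = torch.randint(0, VOCAB_SIZE, (32, MAX_CONTEXT_LEN))
tgt = torch.randint(0, VOCAB_SIZE, (32, 20))
print(train_step(src, tgt[:, :-1], tgt[:, 1:]))
```

In a full reproduction, one would repeat `train_step` over the Holl-E training set for 30 epochs, computing validation BLEU after each epoch and keeping the checkpoint with the best score, as the paper describes.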