RefNet: A Reference-Aware Network for Background Based Conversation
Authors: Chuan Meng, Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke
AAAI 2020, pp. 8496-8503
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that RefNet significantly outperforms state-of-the-art methods in terms of both automatic and human evaluations, indicating that RefNet can generate more appropriate and human-like responses. |
| Researcher Affiliation | Academia | Chuan Meng¹, Pengjie Ren², Zhumin Chen¹, Christof Monz², Jun Ma¹, Maarten de Rijke²; ¹Shandong University, Qingdao, China; ²University of Amsterdam, Amsterdam, The Netherlands |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available online: https://github.com/ChuanMeng/RefNet |
| Open Datasets | Yes | We choose the Holl-E dataset released by Moghe et al. (2018) because it contains boundary annotations of the background information used for each response. |
| Dataset Splits | Yes | We follow the original data split for training, validation and test. There are also two versions of the test set: one with a single golden reference (SR) and the other with multiple golden references (MR); see (Moghe et al. 2018). |
| Hardware Specification | No | The paper does not provide any specific hardware details like GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions optimizers (Adam) and network architectures (GRU) but does not list specific software libraries or their version numbers (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | We set the word embedding size and GRU hidden state size to 128 and 256, respectively. The vocabulary size is limited to 25,000. For fair comparison, all models use the same embedding size, hidden state size and vocabulary size. Following Moghe et al. (2018), we limit the context length of all models to 65. We train all models for 30 epochs, evaluate on the validation set after each epoch, and select the best model according to the BLEU metric. We use gradient clipping with a maximum gradient norm of 2. We use the Adam optimizer with a mini-batch size of 32. The learning rate is 0.001. (A hedged training-setup sketch based on these settings follows the table.) |
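
The experiment-setup row gives every hyperparameter needed to reconstruct the training configuration. Below is a minimal PyTorch sketch, not the authors' released code: `Seq2SeqBaseline` and `train_step` are hypothetical names, and RefNet's reference-aware decoding is omitted. It only shows how the reported sizes (embedding 128, GRU hidden 256, vocabulary 25,000, context length 65) and optimization settings (Adam at learning rate 0.001, mini-batch 32, gradient-norm clipping at 2) fit together.

```python
# Minimal sketch, assuming PyTorch; Seq2SeqBaseline and train_step are
# hypothetical names, and RefNet's reference-aware decoding is not shown.
import torch
import torch.nn as nn

VOCAB_SIZE = 25_000   # vocabulary limited to 25,000
EMBED_SIZE = 128      # word embedding size
HIDDEN_SIZE = 256     # GRU hidden state size
MAX_CONTEXT_LEN = 65  # context length limit, following Moghe et al. (2018)

class Seq2SeqBaseline(nn.Module):
    """GRU encoder-decoder skeleton using the paper's reported sizes."""
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(VOCAB_SIZE, EMBED_SIZE)
        self.encoder = nn.GRU(EMBED_SIZE, HIDDEN_SIZE, batch_first=True)
        self.decoder = nn.GRU(EMBED_SIZE, HIDDEN_SIZE, batch_first=True)
        self.out = nn.Linear(HIDDEN_SIZE, VOCAB_SIZE)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.embedding(src_ids))   # encode the context
        dec_out, _ = self.decoder(self.embedding(tgt_ids), state)
        return self.out(dec_out)                           # per-token logits

model = Seq2SeqBaseline()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # Adam, lr = 0.001
criterion = nn.CrossEntropyLoss()

def train_step(src_ids, tgt_in, tgt_out):
    """One optimization step with the reported optimizer and clipping."""
    optimizer.zero_grad()
    logits = model(src_ids, tgt_in)
    loss = criterion(logits.reshape(-1, VOCAB_SIZE), tgt_out.reshape(-1))
    loss.backward()
    # Gradient clipping with a maximum gradient norm of 2.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0)
    optimizer.step()
    return loss.item()

# Dummy mini-batch of 32, matching the reported batch size.
src = torch.randint(0, VOCAB_SIZE, (32, MAX_CONTEXT_LEN))
tgt = torch.randint(0, VOCAB_SIZE, (32, 20))
print(train_step(src, tgt[:, :-1], tgt[:, 1:]))
```

In a full reproduction, one would repeat `train_step` over the Holl-E training set for 30 epochs, computing validation BLEU after each epoch and keeping the checkpoint with the best score, as the paper describes.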