Hierarchical Recurrent Attention Network for Response Generation

Authors: Chen Xing, Yu Wu, Wei Wu, Yalou Huang, Ming Zhou

Venue: AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Empirical studies on both automatic evaluation and human judgment show that HRAN can significantly outperform state-of-the-art models for context based response generation. |
| Researcher Affiliation | Collaboration | Chen Xing (1,2), Yu Wu (3), Wei Wu (4), Yalou Huang (1,2), Ming Zhou (4). 1: College of Computer and Control Engineering, Nankai University, Tianjin, China; 2: College of Software, Nankai University, Tianjin, China; 3: State Key Lab of Software Development Environment, Beihang University, Beijing, China; 4: Microsoft Research, Beijing, China |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our source code and data at https://github.com/LynetteXing1991/HRAN. |
| Open Datasets | Yes | We built a data set from Douban Group [...]. The data will be publicly available. |
| Dataset Splits | Yes | From them, we randomly sampled 1 million conversations as training data, 10,000 conversations as validation data, and 1,000 conversations as test data, and made sure that there is no overlap among them. (See the split sketch after the table.) |
| Hardware Specification | Yes | All models were initialized with isotropic Gaussian distributions X ~ N(0, 0.01) and trained with an AdaDelta algorithm (Zeiler 2012) on an NVIDIA Tesla K40 GPU. |
| Software Dependencies | No | The paper mentions using an 'AdaDelta algorithm' and 'Blocks' (with a GitHub link), but does not specify version numbers for Blocks or other software libraries. |
| Experiment Setup | Yes | In all models, we set the dimensionality of hidden states of encoders and decoders as 1000, and the dimensionality of word embedding as 620. All models were initialized with isotropic Gaussian distributions X ~ N(0, 0.01) and trained with an AdaDelta algorithm (Zeiler 2012) on an NVIDIA Tesla K40 GPU. The batch size is 128. We set the initial learning rate as 1.0 and reduced it by half if the perplexity on validation began to increase. (See the configuration sketch after the table.) |
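The Dataset Splits row describes a disjoint random partition of Douban Group conversations into 1,000,000 training, 10,000 validation, and 1,000 test examples. A minimal sketch of such a non-overlapping split, assuming the conversations fit in an in-memory list, could look like the following; the function name and the fixed seed are illustrative only and are not taken from the paper.

```python
import random


def split_conversations(conversations, n_train=1_000_000, n_valid=10_000, n_test=1_000, seed=0):
    """Randomly partition conversations into disjoint train/validation/test sets.

    Mirrors the split sizes quoted from the paper; the seed and the in-memory
    list representation are assumptions made for illustration.
    """
    assert len(conversations) >= n_train + n_valid + n_test
    shuffled = conversations[:]              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # shuffle once, then slice into disjoint parts
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:n_train + n_valid + n_test]
    return train, valid, test
```

Because the three sets are consecutive slices of one shuffled list, they cannot overlap, which matches the paper's statement that the splits share no conversations.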
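The Experiment Setup row quotes concrete hyperparameters: hidden size 1000, embedding size 620, isotropic Gaussian initialization N(0, 0.01), AdaDelta with initial learning rate 1.0, batch size 128, and halving the learning rate when validation perplexity starts to rise. The snippet below is a minimal sketch of that configuration in PyTorch, not the authors' released Blocks/Theano code; the placeholder model, the vocabulary size, and the reading of 0.01 as a standard deviation (rather than a variance) are assumptions.

```python
import torch
import torch.nn as nn

HIDDEN_SIZE = 1000     # dimensionality of encoder/decoder hidden states (from the paper)
EMBEDDING_SIZE = 620   # dimensionality of word embeddings (from the paper)
BATCH_SIZE = 128       # batch size (from the paper); would be passed to a DataLoader
INITIAL_LR = 1.0       # initial AdaDelta learning rate (from the paper)
VOCAB_SIZE = 30_000    # assumed; the quoted setup does not state the vocabulary size

# Placeholder components standing in for the HRAN encoder-decoder.
model = nn.ModuleDict({
    "embedding": nn.Embedding(VOCAB_SIZE, EMBEDDING_SIZE),
    "encoder": nn.GRU(EMBEDDING_SIZE, HIDDEN_SIZE, batch_first=True),
})

# "Initialized with isotropic Gaussian distributions X ~ N(0, 0.01)";
# 0.01 is read here as the standard deviation, which may not match the paper's intent.
for param in model.parameters():
    nn.init.normal_(param, mean=0.0, std=0.01)

# "Trained with an AdaDelta algorithm (Zeiler 2012)" with initial learning rate 1.0.
optimizer = torch.optim.Adadelta(model.parameters(), lr=INITIAL_LR)

# "Reduced it by half if the perplexity on validation began to increase":
# ReduceLROnPlateau with factor=0.5 and patience=0 approximates this rule.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=0
)


def end_of_epoch(validation_perplexity: float) -> None:
    """Call once per epoch; halves the learning rate when perplexity stops improving."""
    scheduler.step(validation_perplexity)
```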