RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems

Authors: Chongyang Tao, Lili Mou, Dongyan Zhao, Rui Yan

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on both retrieval and generative dialog systems show that RUBER has a high correlation with human annotation, and that RUBER has fair transferability over different datasets.
Researcher Affiliation | Academia | 1 Institute of Computer Science and Technology, Peking University, China; 2 David R. Cheriton School of Computer Science, University of Waterloo; 3 Beijing Institute of Big Data Research, China. Emails: {chongyangtao,zhaody,ruiyan}@pku.edu.cn, doublepower.mou@gmail.com
Pseudocode | No | The paper does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor a link to a code repository for the described methodology.
Open Datasets | No | We crawled massive data from an online Chinese forum Douban. The training set contains 1,449,218 samples, each of which consists of a query-reply pair.
Dataset Splits | No | The paper mentions a training set and a set of 300 samples for human evaluation, but it does not provide conventional training/validation/test splits (e.g., percentages or counts for a distinct validation set).
Hardware Specification | No | The paper does not provide any specific hardware details, such as GPU/CPU models or the computing resources used for the experiments.
Software Dependencies | No | The paper mentions word2vec embeddings and the Adam optimizer but does not specify their versions, or the versions of any other software libraries or frameworks used.
Experiment Setup | Yes | In the referenced metric, we trained 50-dimensional word2vec embeddings on the Douban dataset. For the unreferenced metric, the dimension of RNN layers was set to 500. The training objective is to minimize J = max(0, Δ − s_U(q, r) + s_U(q, r⁻)) (Eq. 3), where r⁻ denotes a randomly sampled negative reply. We train model parameters with backpropagation using Adam (Kingma and Ba 2015). ... margin Δ (set to 0.05 by validation).
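
The quoted setup covers RUBER's two halves: a referenced metric that compares pooled word2vec sentence vectors by cosine similarity, and an unreferenced metric trained with the hinge objective of Eq. 3. Below is a minimal PyTorch sketch of both pieces; the max pooling, function names, dummy score tensors, and all dimensions other than the 50-dimensional embeddings are illustrative assumptions, since the paper's exact pooling scheme and RNN scorer are not released as code.

```python
import torch
import torch.nn.functional as F

# Referenced metric (sketch): pool word2vec vectors into a sentence vector
# and compare the generated reply with the ground-truth reply by cosine
# similarity. Simple max pooling stands in for the paper's pooling scheme
# (assumption).
def referenced_score(gen_emb: torch.Tensor, ref_emb: torch.Tensor) -> torch.Tensor:
    """gen_emb, ref_emb: (num_words, dim) matrices of word embeddings."""
    v_gen = gen_emb.max(dim=0).values
    v_ref = ref_emb.max(dim=0).values
    return F.cosine_similarity(v_gen, v_ref, dim=0)

# Unreferenced metric objective (Eq. 3): a hinge loss pushing the score of
# a true query-reply pair above a randomly sampled negative reply by at
# least the margin Delta (0.05 in the paper).
def hinge_loss(score_pos: torch.Tensor, score_neg: torch.Tensor,
               margin: float = 0.05) -> torch.Tensor:
    return torch.clamp(margin - score_pos + score_neg, min=0.0).mean()

# Dummy usage: 50-dimensional embeddings as in the paper; the scores would
# really come from the RNN-based matching network (here: random tensors).
gen, ref = torch.randn(7, 50), torch.randn(9, 50)
print(float(referenced_score(gen, ref)))

s_pos = torch.randn(32, requires_grad=True)  # s_U(q, r) for true replies
s_neg = torch.randn(32, requires_grad=True)  # s_U(q, r^-) for negatives
hinge_loss(s_pos, s_neg).backward()
```

Ranking true replies above sampled negatives in this way requires no human relevance labels, which is what makes the unreferenced half of RUBER unsupervised.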