Curriculum Learning for Natural Answer Generation
Authors: Cao Liu, Shizhu He, Kang Liu, Jun Zhao
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that CL-NAG outperforms the state of the art, improving accuracy by 6.8% and 8.7% on simple and complex questions, respectively (Section 4, Experiments). |
| Researcher Affiliation | Academia | National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China |
| Pseudocode | No | The paper describes the methodology in text and a diagram (Figure 2) but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions a third-party tool's GitHub link in a footnote ("WBMs are implemented in https://github.com/Maluuba/nlgeval.") but does not provide access to its own source code for the described methodology, nor an explicit statement about its availability. |
| Open Datasets | Yes | The experimental data is an open, real-world CQA dataset taken from COREQA [He et al., 2017]. |
| Dataset Splits | No | The paper states: "The dataset is divided into simple-QA and complex-QA according to the number of matched knowledge facts, in which simple-QA only matches one grounded fact, and the complex-QA contains multiple grounded facts." It mentions "training data" but does not provide specific percentages or counts for training, validation, and test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions tools and frameworks such as the "Stanford Parser" and the "jieba" toolkit but does not provide specific version numbers for any software dependencies needed to replicate the experiment. |
| Experiment Setup | Yes | For purpose of comparison, we design experimental settings as follows. ... FP: Common and target instances are combined by a fixed proportion (we set it to 0.5). ... a minimum threshold for the low term frequency (e.g. 10) could be used to filter such noise. ... a proportion (e.g. 0.5) of short and long answers is set to choose fewer short answers |
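
The Experiment Setup row above refers to an FP baseline that combines common and target instances at a fixed proportion of 0.5. The sketch below illustrates one way such fixed-proportion batching could be implemented; it is a minimal illustration under assumed conventions, and the function name `mix_fixed_proportion`, the batch size, and the toy data are hypothetical rather than taken from the paper.

```python
import random

def mix_fixed_proportion(common, target, proportion=0.5, batch_size=32, seed=0):
    """Build training batches that combine 'common' and 'target' instances
    at a fixed proportion, in the spirit of the FP setting quoted above.
    All names and defaults here are illustrative, not from the paper."""
    rng = random.Random(seed)
    n_common = int(round(batch_size * proportion))
    n_target = batch_size - n_common
    n_batches = min(len(common) // max(n_common, 1),
                    len(target) // max(n_target, 1))
    batches = []
    for _ in range(n_batches):
        # Draw the fixed quota from each pool, then shuffle within the batch.
        batch = rng.sample(common, n_common) + rng.sample(target, n_target)
        rng.shuffle(batch)
        batches.append(batch)
    return batches

# Toy usage: each instance stands in for a (question, answer) pair.
common_instances = [(f"simple_q_{i}", "short answer") for i in range(100)]
target_instances = [(f"complex_q_{i}", "long natural answer") for i in range(100)]
batches = mix_fixed_proportion(common_instances, target_instances, proportion=0.5)
print(len(batches), len(batches[0]))  # number of batches, instances per batch
```

With `proportion=0.5`, half of each batch comes from the common (simpler) pool and half from the target pool, matching the fixed 0.5 mixing ratio quoted in the Experiment Setup row.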