Conclusion-Supplement Answer Generation for Non-Factoid Questions

Authors: Makoto Nakatsuji, Sohei Okui (pp. 8520-8527)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluations conducted on datasets including the Love Advice and Arts & Humanities categories indicate that our model outputs much more accurate results than the tested baseline models do.
Researcher Affiliation | Industry | Makoto Nakatsuji, Sohei Okui, NTT Resonant Inc., Granpark Tower, 3-4-1 Shibaura, Minato-ku, Tokyo 108-0023, Japan; nakatsuji.makoto@gmail.com, okui@nttr.co.jp
Pseudocode | No | The paper describes the model architecture and equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about, or link to, open-source code for the described methodology.
Open Datasets | Yes | We also used the Yahoo nfL6 dataset, the largest publicly available English non-factoid CQA dataset. It has 499,078 answers to 87,361 questions. https://ciir.cs.umass.edu/downloads/nfL6/ (A loading sketch follows the table.)
Dataset Splits | No | The paper describes training and testing sets ('10,032 question-conclusion-supplement (q-c-s) triples' for training and 'three different sets of 500 human-annotated test pairs' for testing) and mentions 'randomly shuffling the train/test sets', but it does not explicitly describe a separate validation split with specific counts or percentages. (A split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models or memory).
Software Dependencies | No | The paper does not provide version numbers for ancillary software dependencies such as programming languages, libraries, or frameworks (e.g., Python or PyTorch versions).
Experiment Setup | Yes | For both datasets, we tried different parameter values and set the size of the bigram token embedding to 500, the size of the LSTM output vectors for the BiLSTMs to 500, and the number of topics in the CLSTM model to 15. We tried different margins, M, in the hinge loss function and settled on 0.2. The iteration count N was set to 100. We varied α in Eq. (2) from 0 to 2.0 and checked the impact of L_s by changing α. (A hedged configuration sketch follows the table.)
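
To make the Open Datasets row concrete, here is a minimal loading sketch for the nfL6 download. The file name `nfL6.json` and the field names `question`, `answer`, and `nbestanswers` are assumptions about the distribution format, not details confirmed by the paper; adjust them to whatever schema the archive you download actually uses.

```python
import json

# Minimal sketch for loading the Yahoo nfL6 non-factoid CQA dataset.
# Assumption: the download is a single JSON file (nfL6.json) whose records
# carry "question", "answer", and "nbestanswers" fields; adapt to the actual
# schema of the archive from https://ciir.cs.umass.edu/downloads/nfL6/
with open("nfL6.json", encoding="utf-8") as f:
    records = json.load(f)

questions = [r["question"] for r in records]
answers = [a for r in records
             for a in r.get("nbestanswers", [r.get("answer", "")])]

print(f"{len(questions)} questions, {len(answers)} answers")
# The paper reports 87,361 questions and 499,078 answers.
```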
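Because the paper mentions randomly shuffling the train/test sets but reports no separate validation split, the sketch below only illustrates one way such a split could be carved out. The 10% validation fraction, the fixed seed, and the variable names are assumptions for illustration, not the authors' procedure.

```python
import random

def shuffle_and_split(triples, valid_fraction=0.1, seed=42):
    """Shuffle q-c-s triples and hold out a validation portion.

    The paper trains on 10,032 question-conclusion-supplement triples and
    evaluates on sets of 500 human-annotated test pairs; the validation
    fraction here is an assumption, since none is reported.
    """
    rng = random.Random(seed)
    shuffled = triples[:]                 # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]   # (train, validation)

# Example with dummy triples:
dummy = [(f"q{i}", f"c{i}", f"s{i}") for i in range(10032)]
train, valid = shuffle_and_split(dummy)
print(len(train), len(valid))             # 9029 1003
```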
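The Experiment Setup row quotes the reported hyperparameter values. The sketch below collects them in one configuration and pairs them with a generic margin-based hinge loss plus an α-weighted combination of conclusion and supplement losses. Only the numeric values come from the paper; the exact form of Eq. (2), the function names, and the tensor shapes are assumptions.

```python
import torch

# Reported hyperparameter values (quoted from the paper).
config = {
    "bigram_embedding_size": 500,   # bigram token embedding size
    "bilstm_output_size": 500,      # LSTM output vector size for the BiLSTMs
    "num_topics": 15,               # number of topics in the CLSTM model
    "hinge_margin": 0.2,            # margin M in the hinge loss
    "iterations": 100,              # iteration count N
    "alpha_range": (0.0, 2.0),      # α in Eq. (2) was varied over this range
}

def hinge_loss(pos_scores, neg_scores, margin=0.2):
    """Generic margin-based ranking (hinge) loss.

    Assumption: scores for correct answers should exceed scores for sampled
    incorrect answers by at least the margin M = 0.2.
    """
    return torch.clamp(margin - pos_scores + neg_scores, min=0.0).mean()

def combined_loss(loss_conclusion, loss_supplement, alpha):
    """Sketch of an α-weighted combination of the conclusion loss and the
    supplement loss L_s; the exact form of the paper's Eq. (2) is assumed."""
    return loss_conclusion + alpha * loss_supplement

# Usage with dummy scores:
pos = torch.tensor([0.9, 0.7])
neg = torch.tensor([0.4, 0.8])
l = hinge_loss(pos, neg, config["hinge_margin"])
print(combined_loss(l, l, alpha=1.0))
```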