Incorporating Structured Commonsense Knowledge in Story Completion
Authors: Jiaao Chen, Jianshu Chen, Zhou Yu (pp. 6244–6251)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our model outperforms state-of-the-art approaches on a public dataset, ROCStory Cloze Task (Mostafazadeh et al. 2017), and the performance gain from adding the additional commonsense knowledge is significant. [...] We evaluated our model on ROCStories (Mostafazadeh et al. 2017), a publicly available collection of commonsense short stories. [...] We evaluated baselines and our model using accuracy as the metric on the ROCStories dataset, and summarized these results in Table 2. [...] We conducted another two groups of experiments to investigate the contribution of the three different types of information: narrative sequence, sentiment evolution and commonsense knowledge. |
| Researcher Affiliation | Collaboration | Jiaao Chen (Zhejiang University), Jianshu Chen (Tencent AI Lab), Zhou Yu (University of California, Davis). 3150105589@zju.edu.cn, jianshuchen@tencent.com, joyu@ucdavis.edu |
| Pseudocode | Yes | Algorithm 1 Knowledge distance computation |
| Open Source Code | No | The paper mentions "pre-trained parameters released by Open AI 1" with a link "https://github.com/openai/finetune-transformer-lm" (footnote 1). However, this is a third-party resource used by the authors, not the open-source code for the methodology described in *this* paper. |
| Open Datasets | Yes | We evaluated our model on ROCStories (Mostafazadeh et al. 2017), a publicly available collection of commonsense short stories. [...] The published ROCStories dataset (footnote 2: http://cs.rochester.edu/nlp/rocstories) is constructed with ROCStories as a training set that includes 98,162 stories that exclude candidate wrong endings, an evaluation set, and a test set, which have the same structure (1 body + 2 candidate endings) and a size of 1,871. |
| Dataset Splits | Yes | For learning to select the right ending, we randomly split 80% of stories with two candidate endings in the ROCStories evaluation set as our training set (1,479 cases), and 20% of stories in the ROCStories evaluation set as our validation set (374 cases). |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., GPU model, CPU type, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components such as "NLTK and Stanford's CoreNLP tools (Manning et al. 2014)" and "VADER (Hutto and Gilbert 2014)", and states that it uses "Adam to train all parameters", but does not specify version numbers for these tools or for Python, which is implied as the implementation language. |
| Experiment Setup | Yes | Specifically, we set the dimension of LSTM for sentiment prediction to 64. We use a mini-batch size of 8, and Adam to train all parameters. The learning rate is set to 0.001 initially with a decay rate of 0.5 per epoch. |
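The quoted training configuration (Adam, mini-batch size 8, sentiment-LSTM dimension 64, initial learning rate 0.001 with a decay rate of 0.5 per epoch) implies a simple exponential learning-rate schedule. A minimal sketch, assuming the decay is applied once per full epoch; the function name and code are illustrative, not from the authors' implementation:

```python
def learning_rate(epoch, initial_lr=0.001, decay=0.5):
    """Learning rate after `epoch` decay steps.

    Hypothetical reconstruction of the schedule reported in the
    paper: initial rate 0.001, multiplied by 0.5 after each epoch.
    (Other reported hyperparameters: Adam optimizer, mini-batch
    size 8, sentiment-LSTM hidden dimension 64.)
    """
    return initial_lr * decay ** epoch

# First few epochs of the implied schedule:
schedule = [learning_rate(e) for e in range(4)]
# epoch 0: 0.001, epoch 1: 0.0005, epoch 2: 0.00025, epoch 3: 0.000125
```

In most deep-learning frameworks this corresponds to a standard per-epoch exponential decay (e.g., an `ExponentialLR`-style scheduler with gamma 0.5) attached to the Adam optimizer.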