SAM-Net: Integrating Event-Level and Chain-Level Attentions to Predict What Happens Next

Authors: Shangwen Lv, Wanhui Qian, Longtao Huang, Jizhong Han, Songlin Hu

AAAI 2019, pp. 6802-6809

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiment results on the widely used New York Times corpus demonstrate that our model achieves better results than other state-of-the-art baselines by adopting the evaluation of Multi-Choice Narrative Cloze task.
Researcher Affiliation | Academia | Shangwen Lv (1,2), Wanhui Qian (1,2), Longtao Huang (1*), Jizhong Han (1), Songlin Hu (1,2); (1) Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; (2) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Pseudocode | No | The paper describes the model verbally and mathematically but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link regarding the public availability of its source code for the described methodology.
Open Datasets | Yes | Following (Granroth-Wilding and Clark 2016) and (Wang, Zhang, and Chang 2017), we extract events from the NYT portion of the Gigaword corpus (Graff et al. 2003).
Dataset Splits | Yes | The training set consists of 1.01M event chains. We adopt 10,000 event chains as development set and 10,000 event chains as the test set.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or cloud computing instance specifications used for running the experiments.
Software Dependencies | No | The paper mentions using GloVe, C&C tools, OpenNLP, and the Adam Optimizer, but it does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | We set batch size to 128 and regularization weight to λ = 10^-5. We adopt Adam Optimizer (Kingma and Ba 2014) to optimize our model and the initial learning rate is set to 10^-4. We adopt the GloVe (Pennington, Socher, and Manning 2014) pre-trained word embeddings and the dimension is set to 100. The size of LSTM hidden state is set to 128. The parameters are initialized with Xavier Initialization (Glorot and Bengio 2010).
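
To make the reported configuration concrete, below is a minimal training-setup sketch that wires together the dataset split sizes and hyperparameters quoted in the table above. The paper releases no code, so the choice of PyTorch, the EventChainEncoder class, the placeholder vocabulary size, and the mapping of the regularization weight onto Adam's weight_decay are all assumptions; only the numeric values come from the paper.

```python
# Minimal sketch of the reported experiment setup (assumed PyTorch implementation;
# only the numeric values below are taken from the paper).
from typing import Optional

import torch
import torch.nn as nn

# Reported dataset splits: 1.01M training chains, 10k development, 10k test.
TRAIN_CHAINS, DEV_CHAINS, TEST_CHAINS = 1_010_000, 10_000, 10_000

# Reported hyperparameters.
BATCH_SIZE = 128       # batch size
EMBED_DIM = 100        # dimension of the GloVe pre-trained word embeddings
HIDDEN_SIZE = 128      # size of the LSTM hidden state
LEARNING_RATE = 1e-4   # initial learning rate for Adam
L2_WEIGHT = 1e-5       # regularization weight (lambda)


class EventChainEncoder(nn.Module):
    """Hypothetical stand-in for the SAM-Net encoder: a word-embedding layer
    followed by an LSTM, sized as reported in the paper."""

    def __init__(self, vocab_size: int, glove_weights: Optional[torch.Tensor] = None):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, EMBED_DIM)
        if glove_weights is not None:
            # Initialize from 100-d GloVe vectors when they are available.
            self.embedding.weight.data.copy_(glove_weights)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_SIZE, batch_first=True)
        # Xavier initialization of the recurrent weight matrices.
        for name, param in self.lstm.named_parameters():
            if "weight" in name:
                nn.init.xavier_uniform_(param)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embedding(token_ids)   # (batch, seq_len, EMBED_DIM)
        outputs, _ = self.lstm(embedded)       # (batch, seq_len, HIDDEN_SIZE)
        return outputs


# Placeholder vocabulary size; the paper does not report one.
model = EventChainEncoder(vocab_size=50_000)
# Adam with the reported learning rate; the L2 regularization weight is applied
# here as weight decay, an approximation of the paper's penalty term.
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE, weight_decay=L2_WEIGHT)
```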