Discriminative Sentence Modeling for Story Ending Prediction
Authors: Yiming Cui, Wanxiang Che, Wei-Nan Zhang, Ting Liu, Shijin Wang, Guoping Hu. pp. 7602-7609
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the Story Cloze Test dataset show that the proposed model significantly outperforms various systems by a large margin, and detailed ablation studies are given for a better understanding of our model. [...] We evaluate our approach on the Story Cloze Test dataset (Mostafazadeh et al. 2016), which consists of 1,871 samples in the validation and test sets, respectively. |
| Researcher Affiliation | Collaboration | Yiming Cui,1,2 Wanxiang Che,1 Wei-Nan Zhang,1 Ting Liu,1 Shijin Wang,2,3 Guoping Hu2. 1Research Center for Social Computing and Information Retrieval (SCIR), Harbin Institute of Technology, Harbin, China; 2State Key Laboratory of Cognitive Intelligence, iFLYTEK Research, China; 3iFLYTEK AI Research (Hebei), Langfang, China. {ymcui, car, wnzhang, tliu}@ir.hit.edu.cn, {ymcui, sjwang3, gphu}@iflytek.com |
| Pseudocode | Yes | Algorithm 1: Modified Attention-over-Attention. Input: Time-Distributed representation TD1; Time-Distributed representation TD2; Transformation Function η(x), default None. Output: TD1-aware TD2 representation TD. A hedged Python sketch of this interaction is given after the table. |
| Open Source Code | No | The paper states 'Our implementation is based on Keras (Chollet 2015) and TensorFlow (Abadi et al. 2016)', but it does not provide an explicit statement about releasing its own source code or a link to a repository for its methodology. |
| Open Datasets | Yes | We evaluate our approach on the Story Cloze Test dataset (Mostafazadeh et al. 2016), which consists of 1,871 samples in the validation and test sets, respectively. Following previous works (Cai, Tu, and Gimpel 2017; Chaturvedi, Peng, and Roth 2017; Schwartz et al. 2017), we take the validation set for training and evaluate the performance on the test set. |
| Dataset Splits | Yes | We evaluate our approach on the Story Cloze Test dataset (Mostafazadeh et al. 2016), which consists of 1,871 samples in the validation and test sets, respectively. Following previous works (Cai, Tu, and Gimpel 2017; Chaturvedi, Peng, and Roth 2017; Schwartz et al. 2017), we take the validation set for training and evaluate the performance on the test set. Similar to the v1.0 settings, we use the development set (1,571 samples) for training. |
| Hardware Specification | Yes | Traditional neural models are trained on an NVIDIA Tesla V100 GPU. BERT-related models are trained on a single TPU v2, which has 64 GB of HBM. |
| Software Dependencies | No | The paper mentions software such as Keras (Chollet 2015), TensorFlow (Abadi et al. 2016), and the NLTK toolkit (Bird and Loper 2004) but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Embedding Layer: The embedding weights are initialized with the pre-trained GloVe vectors (840B version, 300 dimensions)... Hidden Layer: We use a Bi-LSTM with 200 dimensions for each direction... Regularization: We apply l2-regularization of 0.001 on the embedding weights and a dropout rate of 0.5... Optimization: We use ADAM for weight updating... with an initial learning rate of 0.001, decaying the learning rate at each epoch by a factor of 0.8. We also clip the l2-norm of the gradient to 5... The batch size is set to 32. A hedged Keras sketch instantiating these settings is given after the table. |
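
The pseudocode row quotes Algorithm 1 only at the interface level (two time-distributed representations in, one TD1-aware TD2 representation out). The following is a minimal NumPy sketch of an attention-over-attention style interaction with that interface, shown for a single story/ending pair. The 2-D shapes, the averaging used to re-weight the attention map, and treating a None transformation function as the identity are all assumptions; the paper's actual Algorithm 1 may combine the attention maps differently.

```python
# Hedged sketch of an attention-over-attention style block matching the quoted
# Algorithm 1 interface. Shapes and the exact combination rule are assumptions.
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def modified_aoa(td1, td2, eta=None):
    """td1: [len1, dim], td2: [len2, dim] -> TD1-aware TD2 representation [len2, dim]."""
    if eta is None:                              # "Transformation Function η(x), default None":
        eta = lambda x: x                        # treat None as the identity (assumption)
    m = td2 @ td1.T                              # pairwise matching scores, shape [len2, len1]
    att_over_td1 = softmax(m, axis=1)            # per td2 token: attention over td1 positions
    td1_importance = att_over_td1.mean(axis=0)   # attended-attention summary of td1, shape [len1]
    weights = att_over_td1 * td1_importance      # attention-over-attention re-weighting
    weights /= weights.sum(axis=1, keepdims=True)
    td = weights @ td1                           # TD1-aware representation for every td2 token
    return eta(td)

# Usage on random stand-in encodings (e.g. 400-d Bi-LSTM states = 2 x 200 per direction):
rng = np.random.default_rng(0)
td1 = rng.standard_normal((20, 400)).astype("float32")   # story tokens
td2 = rng.standard_normal((8, 400)).astype("float32")    # ending tokens
print(modified_aoa(td1, td2).shape)                      # (8, 400)
```

In the full model this block would be applied in a time-distributed fashion over sentence pairs; the sketch shows only one pair to keep the shapes easy to follow.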
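
The experiment-setup row lists concrete hyperparameters, so below is a minimal TensorFlow/Keras sketch that wires them together: 300-d GloVe-initialized embeddings with l2 = 0.001, a Bi-LSTM with 200 units per direction, dropout 0.5, ADAM with learning rate 0.001 and per-epoch decay of 0.8, gradient-norm clipping at 5, and batch size 32. The single-input encoder stub with a pooled sigmoid output stands in for the paper's full architecture, and vocab_size and glove_matrix are hypothetical placeholders for the real vocabulary and pre-trained GloVe weights.

```python
# Hedged hyperparameter sketch only; not the paper's released implementation.
import numpy as np
import tensorflow as tf

vocab_size, embed_dim = 50_000, 300          # assumed vocabulary size; 300-d GloVe per the paper
glove_matrix = np.random.randn(vocab_size, embed_dim).astype("float32")  # stand-in for real GloVe weights

embedding = tf.keras.layers.Embedding(
    vocab_size, embed_dim,
    embeddings_initializer=tf.keras.initializers.Constant(glove_matrix),
    embeddings_regularizer=tf.keras.regularizers.l2(0.001),   # l2 = 0.001 on embedding weights
)
encoder = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(200, return_sequences=True)          # 200 units per direction
)
dropout = tf.keras.layers.Dropout(0.5)                        # dropout rate 0.5

tokens = tf.keras.Input(shape=(None,), dtype="int32")
states = dropout(encoder(embedding(tokens)))
score = tf.keras.layers.Dense(1, activation="sigmoid")(
    tf.keras.layers.GlobalMaxPooling1D()(states)              # stub head in place of the full model
)
model = tf.keras.Model(tokens, score)

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001, clipnorm=5.0)  # ADAM, lr 0.001, grad-norm clip 5
model.compile(optimizer=optimizer, loss="binary_crossentropy")

# Decay the learning rate by a factor of 0.8 at each epoch, as described in the setup row.
lr_schedule = tf.keras.callbacks.LearningRateScheduler(lambda epoch: 0.001 * (0.8 ** epoch))
# model.fit(x, y, batch_size=32, callbacks=[lr_schedule])
```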