From Neural Sentence Summarization to Headline Generation: A Coarse-to-Fine Approach

Authors: Jiwei Tan, Xiaojun Wan, Jianguo Xiao

IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on a large real dataset demonstrate the proposed approach significantly improves the performance of neural sentence summarization models on the headline generation task. We conduct experiments on the New York Times news corpus.
Researcher Affiliation | Academia | Jiwei Tan and Xiaojun Wan and Jianguo Xiao, Institute of Computer Science and Technology, Peking University; The MOE Key Laboratory of Computational Linguistics, Peking University; {tanjiwei, wanxiaojun, xiaojianguo}@pku.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper references third-party open-source tools (sumy, Theano, GloVe) but does not provide a link or statement for the authors' own implementation of their proposed method.
Open Datasets | Yes | Previous sentence summarization models are evaluated on news articles from the English Gigaword corpus, and only the lead sentences which have significant overlap with the headlines are selected. In this paper, we conduct experiments on the 1.4 million NYT articles. We train our model on the same Gigaword dataset used in [Rush et al., 2015; Chopra et al., 2016].
Dataset Splits | No | The paper mentions an 'early stopping strategy, which stops training if the performance no longer improves on held-out training data in 20 epochs,' implying a validation set. However, it does not specify the size or percentage of this held-out data as a distinct split for reproduction purposes. (See the early-stopping sketch below the table.)
Hardware Specification | Yes | We run the model on a GTX-1080 GPU card, and it takes about one day for every 100 epochs.
Software Dependencies | No | The paper mentions 'a Python toolkit sumy' and 'the multi-sentence summarization model with theano'. While software is named, specific version numbers for these dependencies are not provided. (See the sumy example below the table.)
Experiment Setup | Yes | For the summary encoder we use three hidden layers of LSTM, and for the control layer we use one layer of LSTM, and each layer has 512 hidden units. The dimension of word vectors is 100. The learning rate of RMSProp is 0.01 and the decay and momentum are both 0.9. We use a batch size of 64 samples, and process 30,016 samples an epoch. (See the configuration sketch below the table.)
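
The early-stopping strategy quoted in the Dataset Splits row can be read as patience-based stopping on a held-out set. The following is a minimal sketch of that reading, not the authors' code; `train_one_epoch` and `evaluate` are hypothetical placeholders, and only the 20-epoch patience comes from the paper.

```python
# Minimal sketch of patience-based early stopping: stop if the held-out
# score has not improved in 20 epochs. `train_one_epoch` and `evaluate`
# are hypothetical placeholders, not functions from the paper.
def train_with_early_stopping(model, train_data, held_out_data,
                              patience=20, max_epochs=1000):
    best_score = float("-inf")
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch(model, train_data)      # one pass over the training data
        score = evaluate(model, held_out_data)  # validation metric on held-out data
        if score > best_score:
            best_score = score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:  # no improvement in 20 epochs
            break
    return model
```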
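
The Software Dependencies row names the Python toolkit sumy without a version. As a hedged illustration of how sumy is typically invoked to pull summary sentences from an article, the snippet below uses the TextRank summarizer and a single output sentence; both choices are assumptions for illustration and are not taken from the paper.

```python
# Hedged example of extracting sentences from an article with sumy.
# The TextRank variant and sentences_count=1 are illustrative assumptions.
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.text_rank import TextRankSummarizer

article_text = "First sentence of the article. Second sentence. Third sentence."
parser = PlaintextParser.from_string(article_text, Tokenizer("english"))
summarizer = TextRankSummarizer()

for sentence in summarizer(parser.document, sentences_count=1):
    print(sentence)
```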
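
The Experiment Setup row quotes enough hyperparameters to write them down as a configuration record. The sketch below collects those values and shows one common RMSProp-with-momentum update rule consistent with the stated learning rate, decay, and momentum; the authors' exact update formulation and epsilon value are not given here, so treat the update function as an assumption.

```python
import numpy as np

# Hyperparameters quoted in the "Experiment Setup" row.
CONFIG = {
    "summary_encoder_lstm_layers": 3,
    "control_lstm_layers": 1,
    "hidden_units_per_layer": 512,
    "word_vector_dim": 100,
    "learning_rate": 0.01,
    "rmsprop_decay": 0.9,
    "rmsprop_momentum": 0.9,
    "batch_size": 64,
    "samples_per_epoch": 30016,
}

def rmsprop_momentum_step(param, grad, mean_square, velocity,
                          lr=CONFIG["learning_rate"],
                          decay=CONFIG["rmsprop_decay"],
                          momentum=CONFIG["rmsprop_momentum"],
                          eps=1e-6):
    """One common RMSProp-with-momentum update (an assumption, not the
    authors' exact formulation): scale the gradient by a running RMS of
    past gradients, then apply a momentum term."""
    mean_square = decay * mean_square + (1.0 - decay) * grad ** 2
    velocity = momentum * velocity - lr * grad / np.sqrt(mean_square + eps)
    return param + velocity, mean_square, velocity
```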