Unsupervised Storyline Extraction from News Articles
Authors: Deyu Zhou, Haiyang Xu, Xin-Yu Dai, Yulan He
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed model has been evaluated on three news corpora and the experimental results show that it outperforms several baseline approaches. |
| Researcher Affiliation | Academia | School of Computer Science and Engineering, Southeast University, China; State Key Laboratory for Novel Software Technology, Nanjing University, China; School of Engineering and Applied Science, Aston University, UK |
| Pseudocode | No | The paper describes the generative process and inference steps in text but does not include structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We crawled and parsed the GDELT Event Database (http://data.gdeltproject.org/events/index.html) containing news articles published in the month of May in 2014. |
| Dataset Splits | No | The paper describes the datasets used (Dataset I, II, and III) but does not provide specific training, validation, and test splits with percentages, counts, or a clear methodology for reproducible data partitioning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Stanford Named Entity Recognizer and light LDA, but does not specify version numbers for these or any other ancillary software dependencies, which would be required for reproducibility. |
| Experiment Setup | Yes | The hyperparameters of the model are set = 1, λ = 0.5, t_bg = 0.1, t_bg = 0.01, t_z = 0.7 (s ∈ 1..S_t, t ∈ 1..T) in our experiment. For SDM, the storyline number is set to 100 on Dataset II and 30 on Dataset III. The topic number is set to 100 on Dataset II and 20 on Dataset III. The number of historical epochs M, which is taken into account for setting the Dirichlet priors for the storyline-keyword, storyline-location, storyline-person, and storyline-organization distributions, is set to 7, the same as in our proposed approach. |
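For reference, the reported settings can be gathered into a single configuration mapping. This is a sketch only: the key names (`lambda`, `t_bg`, `t_z`, and the per-dataset counts) are assumptions based on the quoted text, since the paper's original symbols are partly garbled in the extracted table, and none of these identifiers come from the authors' code.

```python
# Hypothetical configuration collecting the hyperparameters quoted above.
# Key names are illustrative assumptions, not the authors' identifiers.
sdm_config = {
    "lambda": 0.5,          # mixing weight reported as λ = 0.5
    "t_bg": [0.1, 0.01],    # two background thresholds both quoted as t_bg
    "t_z": 0.7,             # threshold applied for s ∈ 1..S_t, t ∈ 1..T
    "history_epochs_M": 7,  # historical epochs used for the Dirichlet priors
    # Per-dataset counts quoted for the SDM baseline:
    "storylines": {"Dataset II": 100, "Dataset III": 30},
    "topics": {"Dataset II": 100, "Dataset III": 20},
}

def setting(dataset: str) -> dict:
    """Return the storyline and topic counts reported for a given dataset."""
    return {
        "storylines": sdm_config["storylines"][dataset],
        "topics": sdm_config["topics"][dataset],
    }
```

A reimplementation attempt could start from such a mapping, but the lost symbol before "= 1" and the duplicated `t_bg` entries would still need to be resolved against the original PDF.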