Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Comparative Document Summarisation via Classification
Authors: Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon, Lexing Xie (pp. 20-28)
AAAI 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate comparative summarisation methods on a newly curated collection of controversial news topics over 13 months. We observe that gradient-based optimisation outperforms discrete and baseline approaches in 15 out of 24 different automatic evaluation settings. |
| Researcher Affiliation | Academia | Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon, Lexing Xie; Australian National University, Data to Decisions CRC; EMAIL |
| Pseudocode | No | The paper describes algorithms but does not provide structured pseudocode or algorithm blocks labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Code, datasets and a supplementary appendix are available at https://github.com/computationalmedia/compsumm |
| Open Datasets | Yes | Code, datasets and a supplementary appendix are available at https://github.com/computationalmedia/compsumm |
| Dataset Splits | Yes | For each news topic, we generate 10 random splits with 80% training articles and 20% test articles for automatic evaluation. |
| Hardware Specification | No | The paper mentions 'use of the NeCTAR Research Cloud' but does not specify any particular hardware components like GPU or CPU models, or memory. |
| Software Dependencies | No | The paper mentions tools and algorithms like GloVe-300, L-BFGS, and k-means++ but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The hyper-parameter γ is chosen along with the trade-off factor λ, and SVM soft margin C using grid search 3 fold cross-validation on the training set. |
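
The Dataset Splits and Experiment Setup rows describe a protocol that can be sketched in a few lines: 10 random 80%/20% train/test splits per topic, then a grid search with 3-fold cross-validation on each training set over γ, λ, and C. The following is a minimal, standard-library-only sketch of that protocol, not the authors' implementation (which is in the linked repository). The function names, the grid values, and the toy `score_fn` are all hypothetical and stand in for the paper's actual models and evaluation metrics.

```python
import itertools
import random


def random_splits(items, n_splits=10, train_frac=0.8, seed=0):
    """Yield (train, test) lists from n_splits independent random shuffles."""
    rng = random.Random(seed)
    cut = int(round(train_frac * len(items)))
    for _ in range(n_splits):
        shuffled = list(items)
        rng.shuffle(shuffled)
        yield shuffled[:cut], shuffled[cut:]


def k_folds(items, k=3):
    """Yield (fit, validate) lists, holding out each of k interleaved folds."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        fit = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield fit, folds[i]


def grid_search(train, grid, score_fn, k=3):
    """Return (params, score) with the best mean validation score over k folds."""
    best_params, best_score = None, float("-inf")
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [score_fn(fit, val, params) for fit, val in k_folds(train, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Hypothetical stand-ins: 50 article ids and an illustrative parameter grid.
articles = list(range(50))
grid = {"gamma": [0.1, 1, 10], "lam": [0.1, 0.5, 0.9], "C": [1, 10, 100]}


def toy_score(fit, val, params):
    """Placeholder score; a real run would train on `fit` and evaluate on `val`."""
    return -((params["gamma"] - 1) ** 2
             + (params["lam"] - 0.5) ** 2
             + (params["C"] - 10) ** 2)


splits = list(random_splits(articles))
best_params, best_score = grid_search(splits[0][0], grid, toy_score)
```

Because hyper-parameters are selected using only the training portion of each split, the 20% test articles stay unseen until the final evaluation.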