Real-Time Web Scale Event Summarization Using Sequential Decision Making
Authors: Chris Kedzie, Fernando Diaz, Kathleen McKeown
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate a 28.3% improvement in summary F1 and a 43.8% improvement in time-sensitive F1 metrics. |
| Researcher Affiliation | Collaboration | Chris Kedzie, Columbia University, Dept. of Computer Science, kedzie@cs.columbia.edu; Fernando Diaz, Microsoft Research, fdiaz@microsoft.com; Kathleen McKeown, Columbia University, Dept. of Computer Science, kathy@cs.columbia.edu |
| Pseudocode | Yes | Algorithm 1: Locally optimal learning to search. |
| Open Source Code | Yes | source code is available at: https://github.com/kedz/ijcai2016 |
| Open Datasets | Yes | We evaluate our method on the publicly available TREC Temporal Summarization Track data (http://www.trec-ts.org/). This data is comprised of three parts. ... |
| Dataset Splits | Yes | To evaluate our model, we randomly select five events to use as a development set and then perform a leave-one-out style evaluation on the remaining 39 events. In order to avoid over-fitting, we select the model iteration for each training fold based on its performance (in F1 score of expected gain and comprehensiveness) on the development set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions 'WordNet' and 'python-goose' but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | we downsample each stream to a length of 100 sentences. The downsampling is done uniformly over the entire stream. This is repeated 10 times for each training event to create a total of 380 training streams. The development set was used to set the threshold. The time window size, similarity threshold, and an offset for the cluster preference are tuned on the development set. |
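
The split and downsampling described in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `split_events` and `downsample_stream` are hypothetical, the seed is arbitrary, and "uniformly over the entire stream" is assumed here to mean uniform random sampling without replacement that preserves the original sentence order. The total of 44 events follows from the 5 development events plus the 39 evaluation events, and each leave-one-out fold trains on 38 events, which at 10 downsampled copies each gives the 380 training streams quoted above.

```python
import random


def split_events(event_ids, n_dev=5, seed=0):
    """Pick a random development set, then build leave-one-out folds
    over the remaining events. Returns (dev, folds), where each fold
    is a (train_events, held_out_event) pair."""
    rng = random.Random(seed)
    shuffled = list(event_ids)
    rng.shuffle(shuffled)
    dev, rest = shuffled[:n_dev], shuffled[n_dev:]
    folds = [([e for e in rest if e != held_out], held_out)
             for held_out in rest]
    return dev, folds


def downsample_stream(stream, length=100, rng=None):
    """Uniformly sample `length` sentences from a stream without
    replacement, keeping them in their original order (assumption)."""
    rng = rng or random.Random()
    if len(stream) <= length:
        return list(stream)
    keep = sorted(rng.sample(range(len(stream)), length))
    return [stream[i] for i in keep]


dev, folds = split_events(range(44))
train_events, held_out = folds[0]

# 10 downsampled copies per training event in a fold
copies_per_event = 10
streams_per_fold = len(train_events) * copies_per_event
```

With 44 events, `len(dev)` is 5, `len(folds)` is 39, and `streams_per_fold` is 380, matching the figures in the table.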