Submodular Span, with Applications to Conditional Data Summarization
Authors: Lilly Kumari, Jeff Bilmes (pp. 12344-12352)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide empirical and qualitative results on three real-world tasks: conditional multi-document summarization on the DUC 2005-2007 datasets, conditional video summarization on the UT-Egocentric dataset, and conditional image corpus summarization on the ImageNet dataset. We use deep neural networks, specifically a BERT model for text, AlexNet for video frames, and Bi-directional Generative Adversarial Networks (BiGAN) for ImageNet images to help instantiate the submodular functions. The result is a minimally supervised form of conditional summarization that matches or improves over the previous state-of-the-art. (A hedged sketch of this kind of feature-based submodular instantiation appears after the table.) |
| Researcher Affiliation | Academia | Lilly Kumari, Jeff Bilmes; Department of Electrical & Computer Engineering, University of Washington, Seattle; {lkumari, bilmes}@uw.edu |
| Pseudocode | No | The paper describes algorithmic steps and refers to standard algorithms like the greedy algorithm and MMin, but does not present any pseudocode blocks or explicitly labeled algorithms. |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology is available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use DUC 2005-2007 datasets which are the benchmark datasets for query-focused MDS, made available by the Document Understanding Conference. [footnote: https://duc.nist.gov] ... ImageNet-1k (Deng et al. 2009) is a large scale image database which contains nearly 1.28 million training images and 50,000 validation images. |
| Dataset Splits | Yes | We use the English uncased variant of the BERT-base model (Devlin et al. 2018) and fine-tune it for the Rouge-2 recall score prediction task using two years of DUC 2005-2007 as the training set. For example, we fine-tune the network on the DUC 2005-2006 datasets in order to extract fixed-size sentence representations for DUC 2007 (which is the test set in this example). We do not use any oracle summarization labels for the test set. ... For DUC-2005, we use DUC-2006 to tune the hyperparameters which include {l, σ, ϵ, r}. Similarly, for DUC-2006 and DUC-2007, we use DUC-2005 as the development set. (A minimal fine-tuning sketch appears after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | Yes | We use the ROUGE toolkit (Lin 2004) which assesses the summary quality by counting the overlapping units such as n-grams, word sequences, and word-pairs between the candidate summary and the reference summaries. We report recall and F-measure corresponding to Rouge-1, Rouge-2, and Rouge-SU4. [footnote: ROUGE version 1.5.5 used with option -n 2 -x -m -2 4 -u -c 95 -r 1000 -f A -p 0.5 -t 0 -d -l 250] (A sample invocation with these options appears after the table.) |
| Experiment Setup | Yes | For DUC-2005, we use DUC-2006 to tune the hyperparameters which include {l, σ, ϵ, r}. Similarly, for DUC-2006 and DUC-2007, we use DUC-2005 as the development set. ... For Video-1, we use Video-3 to tune the hyperparameters which include {k1, k2}; k1 and k2 are the cardinality constraints for optimizing stage one and stage two, respectively. For Video 2-4, we use Video-1 as the development set. ... In all experiments, k is set to 1000. |
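
The method instantiates submodular functions from deep feature embeddings and optimizes them greedily under cardinality constraints such as the k, k1, and k2 mentioned above. Since no code is released, the following is a minimal sketch rather than the authors' implementation: it builds a facility-location function over cosine similarities of (hypothetical) BERT-sized embeddings and maximizes it with the standard greedy algorithm referenced in the Pseudocode row. Facility location and cosine similarity are assumptions; the paper does not specify this exact instantiation.

```python
# Minimal sketch (not the authors' code): facility-location submodular
# function f(S) = sum_i max_{j in S} sim(i, j) built from feature
# embeddings, maximized by the standard greedy algorithm under a
# cardinality constraint k. Facility location and cosine similarity
# are assumptions about the instantiation.
import numpy as np

def greedy_facility_location(embeddings, k):
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = X @ X.T                       # pairwise cosine similarities
    n = sim.shape[0]
    selected = []
    covered = np.zeros(n)               # max_{j in S} sim(i, j) for each i
    for _ in range(min(k, n)):
        best_j, best_gain = -1, -np.inf
        for j in range(n):
            if j in selected:
                continue
            # marginal gain of adding item j to the current summary S
            gain = np.maximum(covered, sim[j]).sum() - covered.sum()
            if gain > best_gain:
                best_j, best_gain = j, gain
        selected.append(best_j)
        covered = np.maximum(covered, sim[best_j])
    return selected

# Usage: pick a 10-item summary of 100 items with 768-d (BERT-sized) vectors.
rng = np.random.default_rng(0)
print(greedy_facility_location(rng.normal(size=(100, 768)), k=10))
```

The paper's two-stage procedure would run a step like this twice, under the constraints k1 and k2 quoted above; the MMin routine it also references is a separate standard algorithm not shown here.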
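The Dataset Splits row describes fine-tuning an English uncased BERT-base model to predict Rouge-2 recall scores, training on two DUC years and testing on the third. Below is a minimal sketch of one such training step, assuming the HuggingFace transformers library; the optimizer, learning rate, and the example sentence and label are illustrative assumptions, not values from the paper.

```python
# Minimal sketch (assumptions: HuggingFace transformers, AdamW, learning
# rate; the input sentence and its Rouge-2 recall label are hypothetical)
# of one training step for the Rouge-2 recall prediction task.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# num_labels=1 turns the classification head into a scalar regression head;
# with float labels, the model applies an MSE loss internally.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

sentences = ["The committee approved the new policy."]  # hypothetical input
labels = torch.tensor([0.31])                 # hypothetical Rouge-2 recall

batch = tokenizer(sentences, padding=True, truncation=True,
                  return_tensors="pt")
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

After fine-tuning, fixed-size sentence representations for the held-out year can presumably be read off the encoder's hidden states, consistent with the quoted setup.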
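The Software Dependencies footnote gives the exact ROUGE-1.5.5 options used for evaluation. Here is a minimal sketch of invoking the Perl toolkit with those options from Python; the script path, the `-e data` directory, and `settings.xml` (the standard ROUGE configuration listing peer and model summaries) are assumptions about a local installation.

```python
# Minimal sketch: run ROUGE-1.5.5 with the options quoted in the paper's
# footnote. Script location, the -e data directory, and settings.xml are
# assumptions about how the toolkit is installed locally.
import subprocess

cmd = [
    "perl", "ROUGE-1.5.5.pl", "-e", "data",
    "-n", "2", "-x", "-m", "-2", "4", "-u",
    "-c", "95", "-r", "1000", "-f", "A",
    "-p", "0.5", "-t", "0", "-d", "-l", "250",
    "-a", "settings.xml",
]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)  # per-evaluation and average Rouge-1/2/SU4 scores
```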