Submodular Span, with Applications to Conditional Data Summarization

Authors: Lilly Kumari, Jeff Bilmes
Pages: 12344-12352

AAAI 2021

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental. LLM Response: We provide empirical and qualitative results on three real-world tasks: conditional multi-document summarization on the DUC 2005-2007 datasets, conditional video summarization on the UT-Egocentric dataset, and conditional image corpus summarization on the ImageNet dataset. We use deep neural networks, specifically a BERT model for text, AlexNet for video frames, and Bidirectional Generative Adversarial Networks (BiGAN) for ImageNet images, to help instantiate the submodular functions. The result is a minimally supervised form of conditional summarization that matches or improves over the previous state-of-the-art.
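As an illustration of how deep features can instantiate a submodular function, here is a minimal sketch: AlexNet features extracted with torchvision feed a facility-location function, one common submodular choice for summarization. The layer choice, weights, and the facility-location form are all assumptions for illustration, not the paper's exact pipeline.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Pretrained AlexNet as a generic feature extractor; the layer and weights
# used here are assumptions, not details from the paper.
alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()

def extract_features(images: torch.Tensor) -> torch.Tensor:
    """(n, 3, 224, 224) normalized batch -> (n, d) unit-norm features."""
    with torch.no_grad():
        feats = alexnet.features(images).flatten(1)
    return F.normalize(feats, dim=1)

def facility_location(sim: torch.Tensor, S: list) -> float:
    """f(S) = sum_i max_{j in S} sim[i, j]; monotone submodular."""
    if not S:
        return 0.0
    return sim[:, S].max(dim=1).values.sum().item()

images = torch.randn(8, 3, 224, 224)   # stand-in batch; real inputs need ImageNet normalization
feats = extract_features(images)
sim = feats @ feats.T                  # cosine similarities (features are unit-norm)
print(facility_location(sim, [0, 3]))  # score of the candidate summary {0, 3}
```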
Researcher Affiliation: Academia. LLM Response: Lilly Kumari, Jeff Bilmes, Department of Electrical & Computer Engineering, University of Washington, Seattle {lkumari, bilmes}@uw.edu
Pseudocode: No. LLM Response: The paper describes algorithmic steps and refers to standard algorithms like the greedy algorithm and MMin, but does not present any pseudocode blocks or explicitly labeled algorithms.
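For reference, the greedy algorithm mentioned above is the standard one (Nemhauser, Wolsey, and Fisher 1978), which gives a (1 - 1/e) approximation for monotone submodular maximization under a cardinality constraint. A generic sketch, not the paper's code:

```python
def greedy(f, ground_set, k):
    """Textbook greedy for monotone submodular maximization, |S| <= k.

    f maps a set to a float; this is the plain O(n*k) variant (a lazy
    greedy would cache marginal gains in a priority queue).
    """
    S = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for e in ground_set - S:
            gain = f(S | {e}) - f(S)  # marginal gain of adding e
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:  # ground set exhausted
            break
        S.add(best)
    return S

# Toy usage with a coverage function (monotone submodular):
sets = {0: {1, 2}, 1: {2, 3}, 2: {4}}
cover = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
print(greedy(cover, set(sets), k=2))  # covers 3 items, e.g. {0, 1}
```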
Open Source Code: No. LLM Response: The paper does not explicitly state that source code for the methodology is available, nor does it provide a link to a code repository.
Open Datasets: Yes. LLM Response: We use the DUC 2005-2007 datasets, which are the benchmark datasets for query-focused MDS, made available by the Document Understanding Conference [footnote: https://duc.nist.gov]. ... ImageNet-1k (Deng et al. 2009) is a large-scale image database which contains nearly 1.28 million training images and 50,000 validation images.
Dataset Splits: Yes. LLM Response: We use the English uncased variant of the BERT-base model (Devlin et al. 2018) and fine-tune it for the Rouge-2 recall score prediction task using two years of DUC 2005-2007 as the training set. For example, we fine-tune the network on the DUC 2005-2006 datasets in order to extract fixed-size sentence representations for DUC 2007 (which is the test set in this example). We do not use any oracle summarization labels for the test set. ... For DUC-2005, we use DUC-2006 to tune the hyperparameters which include {l, σ, ϵ, r}. Similarly, for DUC-2006 and DUC-2007, we use DUC-2005 as the development set.
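The split scheme in this quote is easy to misread, so here is the mapping written out; the dictionary below merely transcribes the quote, and the variable names are ours:

```python
# Leave-one-year-out scheme transcribed from the quote above; the dict just
# encodes that mapping and the names are ours, not the paper's.
YEARS = ["DUC2005", "DUC2006", "DUC2007"]
DEV = {"DUC2005": "DUC2006",  # DUC-2005 tuned on DUC-2006
       "DUC2006": "DUC2005",  # DUC-2006 tuned on DUC-2005
       "DUC2007": "DUC2005"}  # DUC-2007 tuned on DUC-2005

for test_year in YEARS:
    train_years = [y for y in YEARS if y != test_year]  # fine-tune BERT here
    dev_year = DEV[test_year]                           # tune {l, sigma, eps, r} here
    print(f"test={test_year}  train={train_years}  dev={dev_year}")
```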
Hardware Specification: No. LLM Response: The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies: Yes. LLM Response: We use the ROUGE toolkit (Lin 2004) [footnote: ROUGE version 1.5.5 used with options -n 2 -x -m -2 4 -u -c 95 -r 1000 -f A -p 0.5 -t 0 -d -l 250], which assesses the summary quality by counting the overlapping units such as n-grams, word sequences, and word-pairs between the candidate summary and the reference summaries. We report recall and F-measure corresponding to Rouge-1, Rouge-2, and Rouge-SU4.
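Because the footnote pins down the exact ROUGE-1.5.5 options, the evaluation call can be reconstructed. A sketch that shells out to the standard Perl script; the script path, ROUGE data directory, and settings XML are assumptions:

```python
import subprocess

# Options copied verbatim from the paper's footnote; the script location,
# ROUGE data directory (-e), and the settings XML are assumptions.
ROUGE_OPTS = "-n 2 -x -m -2 4 -u -c 95 -r 1000 -f A -p 0.5 -t 0 -d -l 250".split()

cmd = (["perl", "ROUGE-1.5.5.pl", "-e", "rouge/data"]
       + ROUGE_OPTS
       + ["-a", "rouge_settings.xml"])  # -a: score all systems in the XML config
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout)  # Rouge-1 / Rouge-2 / Rouge-SU4 recall and F-measure
```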
Experiment Setup: Yes. LLM Response: For DUC-2005, we use DUC-2006 to tune the hyperparameters which include {l, σ, ϵ, r}. Similarly, for DUC-2006 and DUC-2007, we use DUC-2005 as the development set. ... For Video-1, we use Video-3 to tune the hyperparameters which include {k1, k2}; k1 and k2 are the cardinality constraints for optimizing stage one and stage two respectively. For Video 2-4, we use Video-1 as the development set. ... In all experiments, k is set to 1000.
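Since every hyperparameter above is tuned on a held-out development set, a plain grid search suffices to reproduce the setup. A hedged sketch: `dev_score` is a hypothetical callable and the grids are placeholders, not values from the paper:

```python
from itertools import product

# Placeholder grids: the paper says {l, sigma, eps, r} (text) and {k1, k2}
# (video) are tuned on a dev set, but does not report the grids themselves.
GRID = {"l": [2, 4, 8], "sigma": [0.1, 0.5, 1.0], "eps": [0.01, 0.1], "r": [1, 2]}

def tune(dev_score, grid):
    """Exhaustive search; dev_score is a hypothetical callable mapping a
    hyperparameter dict to a dev-set metric (higher is better)."""
    best_cfg, best = None, float("-inf")
    for values in product(*grid.values()):
        cfg = dict(zip(grid.keys(), values))
        score = dev_score(cfg)
        if score > best:
            best_cfg, best = cfg, score
    return best_cfg

# Dummy scorer standing in for, e.g., Rouge-2 recall on the dev year:
print(tune(lambda cfg: -abs(cfg["sigma"] - 0.5), GRID))
```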