Deep Supervised Summarization: Algorithm and Application to Learning Instructions

Authors: Chengguang Xu, Ehsan Elhamifar

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | By experiments on the problem of learning key-steps (subactivities) of instructional videos, we show that our proposed framework improves the state-of-the-art supervised subset selection algorithms.
Researcher Affiliation | Academia | Chengguang Xu, Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, xu.cheng@husky.neu.edu; Ehsan Elhamifar, Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, eelhami@ccs.neu.edu
Pseudocode | Yes | Algorithm 1: Supervised Facility Location Learning. (A background sketch of the facility-location objective follows the table.)
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We perform experiments on ProceL [1] and Breakfast [58] datasets.
Dataset Splits | Yes | For the experiments on ProceL, we split the videos of each task into 70% for training, 15% for validation and 15% for testing. For Breakfast, we split the videos of each activity into 60% for training, 20% for validation, and 20% for testing. (A split sketch follows the table.)
Hardware Specification | No | The paper mentions using a C3D network and PyTorch for implementation, but it does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | We implemented our framework in PyTorch and used the ADMM framework in [12] for subset selection via UFL and our SupFL. The paper mentions software names but does not provide specific version numbers for reproducibility.
Experiment Setup | Yes | For SupFL(L), we set the dimension of the transformed data to 1000 and 500 for ProceL and Breakfast, respectively, while for SupFL(N) we set the dimensions of the network to 4096-1000-1000 and 4096-1000-500 for ProceL and Breakfast, respectively, where we use ReLU activations for the second layer. We use stochastic gradient descent to train our model and use 5 videos in each minibatch. We use the Adam optimizer with a learning rate of 1e-4 and weight decay of 5e-4. We train our model for at most 50 epochs. In order to improve the training time, after we compute assignments of points to each representative in our alternating algorithm, we randomly sample 10 points from each group and use them to form the loss functions in (6). (A training-configuration sketch follows the table.)
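
As background for the Algorithm 1 row above, here is a minimal sketch of the standard (unsupervised) facility-location objective with a greedy maximizer. It is illustrative only and is not the paper's supervised learning algorithm; the inner-product similarity and the greedy selection routine are assumptions introduced for the example.

```python
import numpy as np

def facility_location_value(sim, subset):
    # F(S) = sum_i max_{j in S} sim(i, j): total coverage of all points
    # by their most similar selected representative.
    return sim[:, subset].max(axis=1).sum()

def greedy_facility_location(sim, budget):
    """Greedily pick `budget` representatives that maximize F(S).

    sim: (n, n) array of pairwise similarities between data points.
    """
    selected, remaining = [], set(range(sim.shape[0]))
    for _ in range(budget):
        # Add the candidate whose inclusion increases coverage the most.
        best = max(remaining,
                   key=lambda j: facility_location_value(sim, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: features for 6 segments, select 2 representatives.
feats = np.random.randn(6, 4)
sim = feats @ feats.T  # inner-product similarity (an assumption)
print(greedy_facility_location(sim, budget=2))
```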
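The dataset splits quoted above (70/15/15 per ProceL task, 60/20/20 per Breakfast activity) can be reproduced with a per-task random split along the lines of the sketch below. The video-list structure, the random seed, and the rounding behavior are assumptions, since the paper does not specify how the splits were drawn.

```python
import random

def split_videos(videos_by_task, ratios=(0.7, 0.15, 0.15), seed=0):
    """Split the videos of each task into train/val/test by the given ratios."""
    rng = random.Random(seed)
    splits = {}
    for task, videos in videos_by_task.items():
        vids = list(videos)
        rng.shuffle(vids)
        n_train = int(ratios[0] * len(vids))
        n_val = int(ratios[1] * len(vids))
        splits[task] = {
            "train": vids[:n_train],
            "val": vids[n_train:n_train + n_val],
            "test": vids[n_train + n_val:],
        }
    return splits

# ProceL-style 70/15/15 split; for Breakfast use ratios=(0.6, 0.2, 0.2).
example = {"make_coffee": [f"video_{i:03d}" for i in range(20)]}
print(split_videos(example))
```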
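The experiment-setup row translates into the following hedged PyTorch training sketch using the reported hyperparameters: a 4096-1000-1000 transformation network (ProceL setting), Adam with learning rate 1e-4 and weight decay 5e-4, minibatches of 5 videos, at most 50 epochs, and 10 sampled points per representative group. The synthetic features, the placeholder loss, and the exact placement of the ReLU are assumptions; the paper's actual supervised facility-location loss (its Eq. 6) is not reproduced here.

```python
import torch
import torch.nn as nn

# Transformation network matching the reported ProceL sizes (4096 -> 1000 -> 1000),
# with a ReLU on the second layer; the exact activation placement is an assumption.
model = nn.Sequential(
    nn.Linear(4096, 1000),
    nn.Linear(1000, 1000),
    nn.ReLU(),
)

# Reported optimizer settings: Adam, learning rate 1e-4, weight decay 5e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-4)

def placeholder_loss(points, representatives):
    # Placeholder surrogate: mean squared distance of sampled points to their
    # representatives. It stands in for the paper's loss in Eq. (6), which is
    # not reproduced here.
    return ((points - representatives) ** 2).sum(dim=1).mean()

num_epochs = 50        # "at most 50 epochs"
videos_per_batch = 5   # "5 videos in each minibatch"
points_per_group = 10  # points randomly sampled per representative group

for epoch in range(num_epochs):
    # Synthetic stand-in for one minibatch of C3D features (segments x 4096).
    features = torch.randn(videos_per_batch * 100, 4096)
    embedded = model(features)

    # Stand-in for "assignments of points to each representative": sample 10
    # points and pair them with a single representative embedding.
    idx = torch.randperm(embedded.size(0))[:points_per_group]
    loss = placeholder_loss(embedded[idx],
                            embedded[:1].expand(points_per_group, -1))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```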