Active Video Summarization: Customized Summaries via On-line Interaction with the User
Authors: Ana Garcia del Molino, Xavier Boix, Joo-Hwee Lim, Ah-Hwee Tan
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate AVS in the commonly used UTEgo dataset. We also introduce a new dataset for customized video summarization (CSumm) recorded with a Google Glass. The results show that AVS achieves an excellent compromise between usability and quality. In 41% of the videos, AVS is considered the best over all tested baselines, including summaries manually generated. |
| Researcher Affiliation | Academia | Ana Garcia del Molino, Xavier Boix, Joo-Hwee Lim, Ah-Hwee Tan. Institute for Infocomm Research, A*STAR, Singapore; School of Computer Science and Engineering, Nanyang Technological University, Singapore; LCSL, Massachusetts Institute of Technology and Istituto Italiano di Tecnologia, MA; Center for Brains, Minds, and Machines, McGovern Institute for Brain Research, Massachusetts Institute of Technology, MA. {stugdma,joohwee}@i2r.a-star.edu.sg, xboix@mit.edu, asahtan@ntu.edu.sg |
| Pseudocode | Yes | Alg. 1: Active Summarization (a hedged sketch of this interactive loop follows the table) |
| Open Source Code | No | The paper does not provide any concrete access information for its source code, nor does it explicitly state that the code will be made open source or is available in supplementary materials. |
| Open Datasets | Yes | We evaluate AVS on two challenging datasets for video summarization: UTEgo (Lee, Ghosh, and Grauman 2012), which is a commonly used egocentric video dataset, and CSumm, a new dataset for customizable video summarization that we introduce. CSumm contains single-shot unconstrained videos of long duration recorded with a Google Glass. |
| Dataset Splits | No | The paper does not provide specific details on training/validation/test splits, such as percentages, sample counts, or explicit citation to predefined splits for reproducibility. It describes general evaluation methodologies like user studies but not data partitioning for model training/validation. |
| Hardware Specification | No | The paper mentions 'Google Glass' and 'Looxcie camera' as recording devices and the use of a neural network (AlexNet), which implies computational hardware, but it does not specify any particular GPU models, CPU models, or other hardware used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions using 'AlexNet' and a 'Belief Propagation (BP)' implementation, but it does not provide specific version numbers for any software libraries, frameworks, or solvers used in the experiments. |
| Experiment Setup | Yes | We set α = 5, β = 1 and γ = 1. ... In the active interaction phase, the multiplier K is set to 5. ... Qi is increased or decreased by Δ, which is set to 100 to ensure that the segments selected by the user appear in the summary, and the discarded do not. ... The duration is variable depending on the length of the original video. It is set to be around 0.1% of the video length, with a minimum of 10 seconds. (These values are collected into a config sketch after the table.) |
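
The Pseudocode row points to Alg. 1 (Active Summarization). As a reading aid, here is a minimal Python sketch of such an interactive loop: the user is repeatedly queried about candidate segments, and each answer shifts that segment's score Qi by Δ = 100, as reported in the Experiment Setup row. The paper infers the summary with Belief Propagation; the top-k stand-in below, the query-selection heuristic, and all function names are assumptions for illustration, not the authors' implementation.

```python
"""Minimal sketch of an active-summarization loop in the spirit of
Alg. 1 (Active Summarization). Assumptions: the paper infers summaries
with Belief Propagation over a graphical model, replaced here by a
simple top-k stand-in; the query-selection heuristic and all function
names are illustrative, not the authors' code."""

import random

DELTA = 100        # per-feedback score change reported in the paper
SUMMARY_LEN = 5    # number of segments kept in the summary (assumed)
QUERIES = 3        # user queries per interaction round (assumed)

def infer_summary(scores):
    # Stand-in for the paper's BP inference: keep the top-scoring segments.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:SUMMARY_LEN])

def pick_queries(scores, summary):
    # Assumed heuristic: query the segments whose scores sit closest to
    # the inclusion boundary, i.e. those the model is least sure about.
    boundary = min(scores[s] for s in summary)
    return sorted(scores, key=lambda s: abs(scores[s] - boundary))[:QUERIES]

def ask_user(segment):
    # Placeholder for real interaction; answers randomly here.
    return random.random() < 0.5

def active_summarization(scores, rounds=3):
    for _ in range(rounds):
        summary = infer_summary(scores)
        for seg in pick_queries(scores, summary):
            # Qi is raised or lowered by Delta so that segments the user
            # selects appear in the summary and discarded ones do not.
            scores[seg] += DELTA if ask_user(seg) else -DELTA
    return infer_summary(scores)

if __name__ == "__main__":
    initial = {f"seg_{i:02d}": random.uniform(0.0, 1.0) for i in range(20)}
    print(sorted(active_summarization(initial)))
```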
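
The Experiment Setup row lists the reported hyperparameters. The sketch below simply collects them in one place for a hypothetical re-run; only the values come from the paper, while the key names and the `summary_duration` helper are assumptions.

```python
# Reported hyperparameters gathered into one config for a hypothetical
# re-run. Only the values come from the paper; the key names and the
# summary_duration helper are assumptions.
AVS_CONFIG = {
    "alpha": 5,             # objective weights as reported (their roles
    "beta": 1,              # are not restated in the row above)
    "gamma": 1,
    "K": 5,                 # multiplier used in the active interaction phase
    "delta": 100,           # score change per user selection/discard
    "summary_ratio": 0.001, # summary length is about 0.1% of the video
    "min_summary_sec": 10,  # with a 10-second lower bound
}

def summary_duration(video_len_sec):
    # Around 0.1% of the video length, never shorter than 10 seconds.
    return max(AVS_CONFIG["summary_ratio"] * video_len_sec,
               AVS_CONFIG["min_summary_sec"])
```

Under these reported values, a three-hour (10,800 s) recording yields a summary of about 10.8 seconds, and any video shorter than roughly 10,000 s (about 2.8 hours) falls back to the 10-second floor.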