Video Summarization via Semantic Attended Networks

Authors: Huawei Wei, Bingbing Ni, Yichao Yan, Huanyu Yu, Xiaokang Yang, Chen Yao

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our method achieves a superior performance gain over previous methods on two benchmark datasets.
Researcher Affiliation | Academia | Huawei Wei, Bingbing Ni, Yichao Yan, Huanyu Yu, Xiaokang Yang; Shanghai Key Laboratory of Digital Media Processing and Transmission, Shanghai Jiao Tong University; weihuawei26@gmail.com, {nibingbing,yanyichao,yuhuanyu,xkyang}@sjtu.edu.cn
Pseudocode | Yes | Algorithm 1: Training semantic attended network
Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the described methodology or a link to a code repository.
Open Datasets | Yes | We evaluate our approach on two video datasets, SumMe (Gygli et al. 2014) and TVSum (Song et al. 2015), annotated with text descriptions created by us.
Dataset Splits | Yes | For each benchmark, we randomly select 80% for training and the remaining 20% for testing.
Hardware Specification | Yes | All experiments are conducted on the GTX TITAN X GPU using Tensorflow (Abadi et al. 2016).
Software Dependencies | No | The paper mentions 'Tensorflow (Abadi et al. 2016)' but does not specify a version number for it or any other software dependencies.
Experiment Setup | Yes | We train our networks with the Adam optimizer with an initial learning rate of 0.0001. All experiments are conducted on the GTX TITAN X GPU using Tensorflow (Abadi et al. 2016). Both the frame selector and the decoder of the video description model are single-layer LSTM networks, and the encoder of the video description model is a bidirectional LSTM network; all these LSTM networks have 1024 hidden units.
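
As a rough illustration of the reported setup, the sketch below assembles the described recurrent components in TensorFlow/Keras. It is not the authors' code: the per-frame feature dimension, the dummy sequence length, and the vocabulary size are assumed placeholders, and the semantic attention mechanism, loss, and training loop of Algorithm 1 are omitted.

```python
# Minimal sketch (assumed, not the authors' implementation) of the recurrent
# components reported in the Experiment Setup row, using TensorFlow/Keras.
import tensorflow as tf

HIDDEN_UNITS = 1024   # reported size for all LSTM networks
FEATURE_DIM = 1024    # assumed per-frame feature size (not given in the quote)
VOCAB_SIZE = 10000    # assumed caption vocabulary size (not given in the quote)

# Frame selector: a single-layer LSTM producing a per-frame importance score.
frame_selector = tf.keras.Sequential([
    tf.keras.layers.LSTM(HIDDEN_UNITS, return_sequences=True),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Video description model: bidirectional LSTM encoder, single-layer LSTM decoder.
encoder = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(HIDDEN_UNITS, return_sequences=True, return_state=True))
decoder = tf.keras.layers.LSTM(HIDDEN_UNITS, return_sequences=True)
word_head = tf.keras.layers.Dense(VOCAB_SIZE)  # logits over caption words

# Optimizer as reported: Adam with initial learning rate 1e-4.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

# Shape check with dummy data: a batch of 2 videos, 30 frames each.
frames = tf.random.normal([2, 30, FEATURE_DIM])
scores = frame_selector(frames)   # (2, 30, 1) per-frame importance scores
encoded = encoder(frames)[0]      # (2, 30, 2 * HIDDEN_UNITS) encoded frame sequence
print(scores.shape, encoded.shape)
```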