Semantic Adversarial Network with Multi-Scale Pyramid Attention for Video Classification

Authors: De Xie, Cheng Deng, Hao Wang, Chao Li, Dapeng Tao (pp. 9030-9037)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on two public benchmarks demonstrate our proposed method achieves state-of-the-art results on standard video datasets.
Researcher Affiliation | Academia | 1 School of Electronic Engineering, Xidian University, Xi'an 710071, China; 2 School of Information Science and Engineering, Yunnan University, Kunming 650091, China
Pseudocode | Yes | Algorithm 1: The optimization algorithm of SAL
Open Source Code | No | The paper does not provide any explicit statements about open-source code availability or links to repositories.
Open Datasets | Yes | UCF101 (Soomro, Zamir, and Shah 2012) and HMDB51 (Kuehne et al. 2011).
Dataset Splits | Yes | We follow the officially offered scheme, which divides each dataset into 3 training and testing splits, and report the average accuracy over the three splits.
Hardware Specification | Yes | We train our model with 4 NVIDIA TITAN X GPUs.
Software Dependencies | No | All the experiments are implemented under PyTorch; no versions or other dependencies are specified.
Experiment Setup | Yes | The mini-batch size is set to 64 and the momentum is set to 0.9. The initial learning rate is set to 0.001 for the Spatial Network and Temporal Network and decreases by a factor of 0.1 every 40 epochs. For the adversarial training procedure, we use the adaptive moment estimation (Adam) algorithm to train the D Network, with an initial learning rate of 0.0001. The training procedures of the Spatial Network and Temporal Network stop after 80 and 120 epochs, respectively. We use gradient clipping of 20 and 40 for the Spatial and Temporal training procedures to avoid gradient explosion.
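
The Experiment Setup row lists concrete optimization hyperparameters. The sketch below is a minimal PyTorch illustration of those reported settings only (SGD with momentum 0.9, learning rate 0.001 decayed by 0.1 every 40 epochs, Adam at 0.0001 for the D Network, gradient-norm clipping at 20 for the spatial stream and 40 for the temporal stream); the spatial_net and d_net modules and the data loader are placeholders, not the authors' architecture or released code.

import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder networks; the paper's Spatial/Temporal/D architectures are not
# reproduced here.
spatial_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 101))
d_net = nn.Sequential(nn.Linear(101, 1), nn.Sigmoid())

# SGD with momentum 0.9, initial LR 0.001, decayed by 0.1 every 40 epochs
# (as reported for the Spatial and Temporal Networks).
optimizer_s = optim.SGD(spatial_net.parameters(), lr=1e-3, momentum=0.9)
scheduler_s = optim.lr_scheduler.StepLR(optimizer_s, step_size=40, gamma=0.1)

# Adam ("adaptive moment estimation") with initial LR 0.0001 for the D Network.
optimizer_d = optim.Adam(d_net.parameters(), lr=1e-4)

criterion = nn.CrossEntropyLoss()

def train_spatial_epoch(loader, clip_norm=20.0):
    """One epoch of the spatial stream; gradient norm clipped at 20
    (40 would be used for the temporal stream), mini-batch size 64."""
    spatial_net.train()
    for frames, labels in loader:
        optimizer_s.zero_grad()
        loss = criterion(spatial_net(frames), labels)
        loss.backward()
        nn.utils.clip_grad_norm_(spatial_net.parameters(), clip_norm)
        optimizer_s.step()

# Reported stopping points: 80 epochs (spatial), 120 epochs (temporal).
# for epoch in range(80):
#     train_spatial_epoch(train_loader)
#     scheduler_s.step()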
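
The Dataset Splits row describes the official three-split protocol with accuracy averaged over splits. As an illustration only (not the authors' pipeline), the sketch below shows one way to follow that protocol using torchvision's UCF101 wrapper, which exposes the official splits via its fold argument; the paths, frames_per_clip value, and the model callable are assumptions.

import torch
from torch.utils.data import DataLoader
from torchvision.datasets import UCF101

# Hypothetical paths to the videos and the official train/test split files.
VIDEO_ROOT = "data/UCF101/videos"
ANNOTATION_PATH = "data/UCF101/ucfTrainTestlist"

@torch.no_grad()
def top1_accuracy(model, dataset, batch_size=64):
    """Top-1 accuracy of `model` (a callable returning class logits)."""
    loader = DataLoader(dataset, batch_size=batch_size)
    correct = total = 0
    for video, _audio, label in loader:  # UCF101 items are (video, audio, label)
        pred = model(video).argmax(dim=1)
        correct += (pred == label).sum().item()
        total += label.numel()
    return correct / total

def mean_accuracy_over_splits(model):
    """Average top-1 accuracy over the three official UCF101 test splits."""
    accs = []
    for fold in (1, 2, 3):
        test_set = UCF101(VIDEO_ROOT, ANNOTATION_PATH,
                          frames_per_clip=16, fold=fold, train=False)
        accs.append(top1_accuracy(model, test_set))
    return sum(accs) / len(accs)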