Semantic Adversarial Network with Multi-Scale Pyramid Attention for Video Classification
Authors: De Xie, Cheng Deng, Hao Wang, Chao Li, Dapeng Tao
Pages: 9030-9037
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two public benchmarks demonstrate our proposed methods achieves state-of-the-art results on standard video datasets. |
| Researcher Affiliation | Academia | (1) School of Electronic Engineering, Xidian University, Xi'an 710071, China; (2) School of Information Science and Engineering, Yunnan University, Kunming 650091, China |
| Pseudocode | Yes | Algorithm 1 The optimization algorithm of SAL |
| Open Source Code | No | The paper does not provide any explicit statements about open-source code availability or links to repositories. |
| Open Datasets | Yes | UCF101 (Soomro, Zamir, and Shah 2012) and HMDB51 (Kuehne et al. 2011). |
| Dataset Splits | Yes | We follow the officially offered scheme which divides dataset into 3 training and testing splits and finally report the average accuracy over the three splits. |
| Hardware Specification | Yes | We train our model with 4 NVIDIA TITAN X GPUs |
| Software Dependencies | No | all the experiments are implemented under the Pytorch. |
| Experiment Setup | Yes | The mini-batch size is set to 64 and the momentum is set to 0.9. The initial learning rate is set to 0.001 for Spatial Network and Temporal Network and decreases by 0.1 every 40 epochs. For adversarial training procedure, we use adaptive moment estimation algorithm to train D Network and the initial learning rate is set to 0.0001. The training procedure of Spatial Network and Temporal Network stops after 80 epochs and 120 epochs respectively. We use gradient clipping of 20 and 40 for Spatial and Temporal training procedure to avoid gradient explosion. |
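
The reported experiment setup can be collected into a small configuration sketch. The block below is only an illustration of the quoted hyperparameters, not the authors' code: the names `StreamConfig`, `SPATIAL`, `TEMPORAL`, `DISCRIMINATOR_LR`, `learning_rate`, and `schedule` are hypothetical, "decreases by 0.1 every 40 epochs" is read as multiplying the learning rate by 0.1, and epochs are assumed to be 0-indexed. The quoted text does not say whether the clipping thresholds of 20 and 40 apply to the gradient norm or to individual gradient values, so the sketch only records the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamConfig:
    """Hyperparameters quoted for one stream (field names are illustrative)."""
    name: str
    total_epochs: int
    grad_clip: float              # clipping threshold; norm vs. value is not stated
    batch_size: int = 64
    momentum: float = 0.9
    initial_lr: float = 1e-3
    lr_decay_factor: float = 0.1  # assumption: multiplicative decay
    lr_decay_every: int = 40      # epochs between decays


# Values taken from the Experiment Setup row above.
SPATIAL = StreamConfig(name="spatial", total_epochs=80, grad_clip=20.0)
TEMPORAL = StreamConfig(name="temporal", total_epochs=120, grad_clip=40.0)

# Initial learning rate quoted for the D Network (trained with Adam).
DISCRIMINATOR_LR = 1e-4


def learning_rate(cfg: StreamConfig, epoch: int) -> float:
    """Step decay: scale the initial rate by the factor once per completed period.

    Assumes 0-indexed epochs, so the first decay takes effect at epoch 40.
    """
    return cfg.initial_lr * cfg.lr_decay_factor ** (epoch // cfg.lr_decay_every)


def schedule(cfg: StreamConfig) -> list:
    """Learning rate for every training epoch of the given stream."""
    return [learning_rate(cfg, epoch) for epoch in range(cfg.total_epochs)]
```

Under these assumptions the Spatial Network sees one decay (at epoch 40) before stopping at 80 epochs, and the Temporal Network sees two (at epochs 40 and 80) before stopping at 120. In PyTorch, a comparable schedule is typically expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)`, though the summary does not confirm which scheduler the authors used.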