Memory-Augmented Temporal Dynamic Learning for Action Recognition
Authors: Yuan Yuan, Dong Wang, Qi Wang
AAAI 2019, pp. 9167-9175 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate this end-to-end system on benchmark datasets (UCF101 and HMDB51) of human action recognition. The experimental results show consistent improvements on both datasets over prior works and our baselines. |
| Researcher Affiliation | Academia | Yuan Yuan, Dong Wang, Qi Wang School of Computer Science and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an 710072, China, {y.yuan1.ieee, nwpuwangdong, crabwq}@gmail.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of its source code. |
| Open Datasets | Yes | Experiments are mainly conducted on two action recognition benchmark datasets: UCF101 (Soomro, Zamir, and Shah 2012) and HMDB51 (Kuehne et al. 2011). |
| Dataset Splits | No | The paper states: 'For both of them, we follow the provided evaluation protocol and adopt standard training/test splits and report the mean classification accuracy over these splits.' It defers to the standard splits but does not itself state split sizes (percentages or sample counts), so the paper alone is not sufficient to reproduce the splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments. It only mentions that 'All the experiments are run on the PyTorch toolbox'. |
| Software Dependencies | No | The paper states: 'All the experiments are run on the PyTorch toolbox (Paszke et al. 2017).' It mentions PyTorch but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | The mini-batch size is set to 64 and the momentum is set to 0.9. A small learning rate is used throughout. For spatial-stream networks, the learning rate is initialized to 0.001 and decreased by 1/10 every 6,000 iterations; the training procedure stops after 18,000 iterations. For the temporal stream, the learning rate is initialized to 0.005 and reduced to 1/10 of its value after 48,000 and 72,000 iterations, with a maximum of 80,000 iterations. Gradient clipping of 20 is used to avoid exploding gradients at the early stage. (A hedged PyTorch sketch of this schedule follows the table.) |
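Since no source code is released, the reported schedule can only be approximated. Below is a minimal PyTorch sketch of the spatial-stream training loop, assuming plain SGD (the paper states the momentum but not the optimizer class) and a per-iteration step decay; the model, input shapes, and random batches are placeholders, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder network: the paper releases no code, so any nn.Module
# (e.g., the spatial-stream CNN) can stand in here.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 101))

# Reported hyperparameters: mini-batch size 64, momentum 0.9.
# The optimizer class is not stated in the paper; SGD is assumed.
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Spatial stream: lr starts at 0.001 and is divided by 10 every
# 6,000 iterations; training stops after 18,000 iterations.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=6000, gamma=0.1)

# Temporal stream (alternative schedule, not used in this loop):
# lr starts at 0.005 and drops by 10x after 48,000 and 72,000
# iterations, for 80,000 iterations total.
# scheduler = optim.lr_scheduler.MultiStepLR(
#     optimizer, milestones=[48000, 72000], gamma=0.1)

criterion = nn.CrossEntropyLoss()
max_iters = 18000  # spatial-stream stopping point

for it in range(max_iters):
    # Random tensors stand in for the (unspecified) data pipeline.
    inputs = torch.randn(64, 3, 32, 32)
    labels = torch.randint(0, 101, (64,))

    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    loss.backward()
    # Gradient clipping of 20, as reported, to curb exploding gradients.
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=20.0)
    optimizer.step()
    scheduler.step()  # the schedule is defined per iteration, not per epoch
```

Note that `scheduler.step()` is called once per iteration rather than once per epoch, since the paper specifies all decay points in iterations; the commented `MultiStepLR` line shows the corresponding temporal-stream schedule.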