Memory-Augmented Temporal Dynamic Learning for Action Recognition

Authors: Yuan Yuan, Dong Wang, Qi Wang (pp. 9167-9175)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate this end-to-end system on benchmark datasets (UCF101 and HMDB51) of human action recognition. The experimental results show consistent improvements on both datasets over prior works and our baselines."
Researcher Affiliation | Academia | Yuan Yuan, Dong Wang, Qi Wang; School of Computer Science and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an 710072, China; {y.yuan1.ieee, nwpuwangdong, crabwq}@gmail.com
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of its source code.
Open Datasets | Yes | "Experiments are mainly conducted on two action recognition benchmark datasets: UCF101 (Soomro, Zamir, and Shah 2012) and HMDB51 (Kuehne et al. 2011)."
Dataset Splits | No | The paper states: "For both of them, we follow the provided evaluation protocol and adopt standard training/test splits and report the mean classification accuracy over these splits." This names the standard splits but gives no specific details (percentages or counts) for reproducibility within the paper itself. A sketch of the quoted protocol appears after the table.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory sizes) used for running the experiments. It mentions only that "All the experiments are run on the PyTorch toolbox."
Software Dependencies | No | The paper states: "All the experiments are run on the PyTorch toolbox (Paszke et al. 2017)." It names PyTorch but does not specify its version or any other software dependencies with versions.
Experiment Setup | Yes | "The mini-batch size is set to 64 and the momentum is set to 0.9. We use a small learning rate in our experiments. For spatial-stream networks, the learning rate is initialized as 0.001 and decreased by 1/10 every 6,000 iterations. The training procedure stops after 18,000 iterations. For the temporal stream, we initialize the learning rate as 0.005, which reduces to its 1/10 after 48,000 and 72,000 iterations. The maximum iteration is set as 80,000. We use gradient clipping of 20 to avoid exploding gradients at the early stage." A sketch of this schedule appears after the table.