Graph Attention Based Proposal 3D ConvNets for Action Detection
Authors: Jun Li, Xianglong Liu, Zhuofan Zong, Wanru Zhao, Mingyuan Zhang, Jingkuan Song
AAAI 2020, pp. 4626-4633
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two proposal 3D ConvNets based models (P-C3D and P-ResNet) and two popular action detection benchmarks (THUMOS 2014, ActivityNet v1.3) demonstrate the state-of-the-art performance achieved by our method. Particularly, P-C3D embedded with our module achieves a 3.7% average mAP improvement on the THUMOS 2014 dataset compared to the original model. See also the sections "Comparison with State-of-the-art Methods" and "Ablation Study". |
| Researcher Affiliation | Academia | Jun Li (1), Xianglong Liu (1,2), Zhuofan Zong (1), Wanru Zhao (1), Mingyuan Zhang (1), Jingkuan Song (3). (1) State Key Lab of Software Development Environment, Beihang University, Beijing, China; (2) Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, China; (3) Innovation Center, University of Electronic Science and Technology of China, Chengdu, China |
| Pseudocode | No | The paper describes methods and processes in narrative text and figures, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper references a GitHub link (https://github.com/sunnyxiaohu/R-C3D.pytorch) in a footnote, stating: "We implement our graph attention module with framewise constraint mainly on R-C3D model, which is written in pytorch." However, this link points to the baseline R-C3D model and not the authors' specific modifications or proposed AGCN module, nor does it explicitly state their code is open-source. |
| Open Datasets | Yes | THUMOS 2014 (Jiang et al. 2014); ActivityNet v1.3 (Fabian Caba Heilbron and Niebles 2015) |
| Dataset Splits | Yes | THUMOS 2014... It includes 2765 trimmed videos of these 20 actions in UCF101 for training, and 200 and 213 untrimmed videos with temporal annotations for the validation and test sets respectively. ActivityNet v1.3... It is divided into training, validation and test sets with ratio 2:1:1. It has 10024, 4926 and 5044 videos for the training, validation and test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory configurations used for running the experiments. |
| Software Dependencies | No | The paper mentions "pytorch" as the framework used but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | For both models and datasets, we decompress the videos into frames at 25 frames per second (fps), and create a buffer of 768 frames. For THUMOS 2014, the learning rate is kept fixed at 10^-4 for the first 3 epochs and is decreased to 10^-5 for the last 2 epochs. We choose 10 anchor segments with specific scale values [2, 4, 5, 6, 8, 9, 10, 12, 14, 16]. We use the Sports-1M pretrained model to initialize the training. For ActivityNet v1.3, the learning rate is kept fixed at 10^-4 for the first 6 epochs and is decreased to 10^-5 for the last 2 epochs. We choose 37 anchor segments with specific scale values [1, 1.25, 1.5, 1.75, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22, 24, 28, 32, 36, 40, 44, 52, 60, 68, 76, 84, 92, 100]. The learning rate of our module is 10 times larger than the basic model for both datasets. For THUMOS 2014, the learning rate is kept fixed at 10^-4 for the first 4 epochs and is decreased to 10^-5 for the last 2 epochs. We use the UCF-101 pretrained model to initialize the training. |
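The learning-rate schedule and anchor scales quoted above can be sketched as plain Python. This is a minimal illustration, not the authors' code; the function and variable names are assumptions introduced here.

```python
# Minimal sketch (not the authors' code) of the training schedule quoted in
# the table: base learning rate 1e-4, later dropped to 1e-5, with the graph
# attention module trained at 10x the base rate. Names are illustrative.

def lr_for_epoch(epoch, drop_epoch, base_lr=1e-4, decayed_lr=1e-5):
    """Step schedule: base_lr before drop_epoch, decayed_lr from then on."""
    return base_lr if epoch < drop_epoch else decayed_lr

# THUMOS 2014 (P-C3D): fixed for the first 3 epochs, decreased for the last 2.
thumos_lrs = [lr_for_epoch(e, drop_epoch=3) for e in range(5)]

# ActivityNet v1.3: fixed for the first 6 epochs, decreased for the last 2.
anet_lrs = [lr_for_epoch(e, drop_epoch=6) for e in range(8)]

# The proposed module's learning rate is 10x the base model's at every epoch.
module_lrs = [10 * lr for lr in thumos_lrs]

# Anchor segment scales quoted from the paper (THUMOS 2014 uses 10 anchors).
THUMOS_ANCHOR_SCALES = [2, 4, 5, 6, 8, 9, 10, 12, 14, 16]
assert len(THUMOS_ANCHOR_SCALES) == 10
```

The step schedule simply switches between two constant rates at the reported drop epoch, so either per-dataset schedule reduces to choosing a different `drop_epoch`.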