Action Recognition With Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion
Authors: Weiyao Lin, Chongyang Zhang, Ke Lu, Bin Sheng, Jianxin Wu, Bingbing Ni, Xin Liu, Hongkai Xiong
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on action recognition benchmarks demonstrate that our approach achieves the state-of-the-art performance. |
| Researcher Affiliation | Academia | ¹Department of Electronic Engineering, Shanghai Jiao Tong University, China; ²National Key Laboratory for Novel Software Technology, Nanjing University, China; ³University of Chinese Academy of Sciences, China |
| Pseudocode | No | The paper describes methods using prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We perform experiments on two benchmark datasets: UCF101 (Soomro, Zamir, and Shah 2012) and HMDB51 (Kuehne et al. 2011). |
| Dataset Splits | Yes | Table 1 compares the action recognition results on split 1 of UCF101 and HMDB51 datasets... Note that in this experiment, we adopt three training/testing splits on both datasets in order to have a fair comparison with other methods. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper states 'We implement our approach on Caffe (Jia et al. 2014)' but does not provide specific version numbers for Caffe or other software dependencies. |
| Experiment Setup | Yes | The batch size and momentum are set to be 16 and 0.9, respectively. The weight parameters for different granularities in the coarse-to-fine network (α1, α2, α3 in Eq. 1) are set to be 0.1, 0.1, 1 respectively. Besides, the weight parameters for the LSTM models in the coarse-to-fine and asynchronous fusion networks (β and γ in Eqs. 3 and 5) are set as 2 to let the networks focus more on the reliability of their final outputs. When training the entire framework, we set the initial learning rate to 10⁻², decreased to 1/10 of its value every 20K iterations. The maximum iteration is 100K. |
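The learning-rate schedule quoted above (initial rate 10⁻², divided by 10 every 20K iterations, 100K iterations total) is a standard step-decay policy. A minimal sketch of that schedule, assuming the function name `step_lr` and the closed-form step-decay formula (neither is specified in the paper, which only reports the hyperparameter values):

```python
def step_lr(iteration, base_lr=1e-2, gamma=0.1, step_size=20_000):
    """Learning rate at a given iteration under step decay:
    base_lr multiplied by gamma once per completed step_size iterations.
    Values (1e-2, 0.1, 20K) follow the paper's reported setup."""
    return base_lr * gamma ** (iteration // step_size)

if __name__ == "__main__":
    # Over the reported 100K iterations, the rate steps down five times.
    for it in (0, 20_000, 40_000, 60_000, 80_000):
        print(f"iter {it:>6}: lr = {step_lr(it):.0e}")
```

Under this policy the rate at the final (100K-th) iteration would be 10⁻⁷; the paper does not state whether the schedule is implemented this way in Caffe's solver configuration, so treat this as an interpretation of the quoted numbers.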