Dynamic Normalization and Relay for Video Action Recognition

Authors: Dongqi Cai, Anbang Yao, Yurong Chen

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that DNR brings large performance improvements to the baselines, achieving over 4.4% absolute margins in top-1 accuracy without training bells and whistles. More experiments on 3D backbones and several of the latest 2D spatial-temporal networks further validate its effectiveness.
Researcher Affiliation | Industry | Dongqi Cai (Intel Labs China, dongqi.cai@intel.com); Anbang Yao (Intel Labs China, anbang.yao@intel.com); Yurong Chen (Intel Labs China, yurong.chen@intel.com)
Pseudocode | No | The paper does not contain a block explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | Code will be available at https://github.com/caidonkey/dnr.
Open Datasets | Yes | Four public datasets, including Kinetics-400 [5], Kinetics-200 [65], Something-Something (Sth-Sth) V1 [18] and V2 [40], are considered in the experiments.
Dataset Splits | No | On Sth-Sth V1&V2 (with shorter video durations compared to Kinetics), we follow the pre-training (on Kinetics-400) and fine-tuning strategy, and report 1-clip, center-crop testing accuracy on the validation set. Detailed implementations are in the supplemental material.
Hardware Specification | No | The paper's checklist states: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Section 4 and supplemental material.' However, Section 4 in the provided text does not contain specific hardware details such as GPU models, CPU types, or cloud providers.
Software Dependencies | No | For fair comparisons, we choose MMAction2 (https://github.com/open-mmlab/mmaction2) for implementing all methods. No specific version number for MMAction2 is provided.
Experiment Setup | Yes | We adopt a cosine schedule of learning-rate decay and use a linear warm-up strategy with a warm-up ratio of 0.01 in the first 60K/128K iterations. [...] Under the batch size of 8/6/4, the learning rate is initialized to 0.1/0.075/0.05 and decayed with a cosine scheduling. We also use a linear warm-up strategy with a warm-up ratio of 0.01 in the first 60K iterations. [...] Our LSTM structure has two basic hyper-parameters: reduction ratio r and the number of bottleneck units d. [...] with r = 4 and d = 1, which are used as the default settings.
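The quoted schedule (linear warm-up at a ratio of 0.01 followed by cosine decay) can be made concrete with a minimal sketch. This is not the authors' implementation: the function name lr_at_iter and the 240K total-iteration count are illustrative assumptions, while the base learning rate of 0.1 (batch size 8) and the 60K warm-up iterations come from the quote above.

```python
# Minimal sketch, assuming a fixed total iteration budget: linear warm-up
# from (warmup_ratio * base_lr) to base_lr, then cosine decay to zero.
import math

def lr_at_iter(it, base_lr=0.1, warmup_iters=60_000,
               total_iters=240_000, warmup_ratio=0.01):
    """Return the learning rate at training iteration `it` (hypothetical helper)."""
    if it < warmup_iters:
        # Linear warm-up: ramp from warmup_ratio * base_lr up to base_lr.
        start = warmup_ratio * base_lr
        return start + (base_lr - start) * it / warmup_iters
    # Cosine decay over the remaining iterations.
    progress = (it - warmup_iters) / max(1, total_iters - warmup_iters)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Example: base_lr = 0.1 for batch size 8, as quoted above.
for it in (0, 30_000, 60_000, 150_000, 240_000):
    print(it, round(lr_at_iter(it), 5))
```

Since the evaluation notes that MMAction2 was used, such a schedule would in practice be expressed through MMAction2's configuration files rather than hand-written code; the sketch only serves to make the quoted numbers concrete.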