Dynamic Normalization and Relay for Video Action Recognition
Authors: Dongqi Cai, Anbang Yao, Yurong Chen
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that DNR brings large performance improvements to the baselines, achieving over 4.4% absolute margins in top-1 accuracy without training bells and whistles. More experiments on 3D backbones and several latest 2D spatial-temporal networks further validate its effectiveness. |
| Researcher Affiliation | Industry | Dongqi Cai (Intel Labs China, dongqi.cai@intel.com); Anbang Yao (Intel Labs China, anbang.yao@intel.com); Yurong Chen (Intel Labs China, yurong.chen@intel.com) |
| Pseudocode | No | The paper does not contain a block explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Code will be available at https://github.com/caidonkey/dnr. |
| Open Datasets | Yes | Four public datasets, including Kinetics-400 [5], Kinetics-200 [65], Something-Something (Sth-Sth) V1 [18] and V2 [40], are considered in the experiments. |
| Dataset Splits | No | On Sth-Sth V1&V2 (with shorter video durations compared to Kinetics), we follow the pre-training (on Kinetics-400) and fine-tuning strategy, and report 1 clip and center-crop testing accuracy on validation set. Detailed implementations are in the supplemental material. |
| Hardware Specification | No | The paper states: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Section 4 and supplemental material.' However, Section 4 in the provided text does not contain specific hardware details like GPU models, CPU types, or cloud providers. |
| Software Dependencies | No | For fair comparisons, we choose MMAction2 (https://github.com/open-mmlab/mmaction2) for implementing all methods. No specific version number for MMAction2 is provided. |
| Experiment Setup | Yes | We adopt a cosine schedule for learning rate decay and use a linear warm-up strategy with a warm-up ratio of 0.01 in the first 60K/128K iterations. [...] Under a batch size of 8/6/4, the learning rate is initialized to 0.1/0.075/0.05 and decayed with cosine scheduling. We also use a linear warm-up strategy with a warm-up ratio of 0.01 in the first 60K iterations. [...] Our LSTM structure has two basic hyper-parameters: the reduction ratio r and the number of bottleneck units d. [...] r = 4 and d = 1 are used as the default settings. (A minimal sketch of the learning rate schedule follows the table.) |
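
The reported schedule (linear warm-up at a ratio of 0.01 followed by cosine decay, with a base learning rate of 0.1 at batch size 8) can be approximated in a few lines of PyTorch. The sketch below is not the authors' implementation (they build on MMAction2); the total iteration count, the `LambdaLR` wiring, and the stand-in model are illustrative assumptions.

```python
# Minimal sketch of the reported schedule: linear warm-up for the first
# `warmup_iters` iterations (warm-up ratio 0.01), then cosine decay of the
# learning rate toward zero. The 200K total iterations and the stand-in
# model are placeholders, not values from the paper.
import math

import torch
from torch.optim.lr_scheduler import LambdaLR


def warmup_cosine_factor(step: int, warmup_iters: int, total_iters: int,
                         warmup_ratio: float = 0.01) -> float:
    """Multiplicative factor applied to the base learning rate at `step`."""
    if step < warmup_iters:
        # Linear warm-up from warmup_ratio * base_lr to base_lr.
        return warmup_ratio + (1.0 - warmup_ratio) * step / warmup_iters
    # Cosine decay from base_lr to 0 over the remaining iterations.
    progress = (step - warmup_iters) / max(1, total_iters - warmup_iters)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


# Batch size 8 -> base LR 0.1 and 60K warm-up iterations, as quoted above.
model = torch.nn.Linear(16, 400)  # stand-in for the video backbone
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = LambdaLR(optimizer, lr_lambda=lambda s: warmup_cosine_factor(
    s, warmup_iters=60_000, total_iters=200_000))

for step in range(3):  # training loop stub
    optimizer.step()
    scheduler.step()
```

The same behaviour would typically be expressed in an MMAction2/mmcv-style config via a `CosineAnnealing` learning rate policy with linear warm-up, but since the paper does not list its config files, the plain PyTorch form above is kept as the sketch.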