Temporal Interlacing Network
Authors: Hao Shao, Shengju Qian, Yu Liu
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically prove that with a learnable interlacing target, TIN performs equivalently to the regularized temporal convolution network (r-TCN). Exhaustive experiments on 6 challenging benchmarks further demonstrate that the proposed TIN gains 4% more accuracy with 6x less latency and becomes the new state-of-the-art method. In the experiments, we demonstrate the effectiveness of the proposed TIN on many video datasets. We first introduce the datasets used in our experiments, then provide a quantitative analysis against a 2D CNN baseline and TSM (Lin, Gan, and Han 2018). We also compare with state-of-the-art results on Something-Something (V1 & V2). Finally, we conduct ablation experiments on TIN and study the functionality of our design. (A minimal sketch of the interlacing operation appears after this table.) |
| Researcher Affiliation | Collaboration | Hao Shao (1,2), Shengju Qian (3), Yu Liu (3); affiliations: 1 Tsinghua University, 2 SenseTime X-Lab, 3 The Chinese University of Hong Kong |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided in the paper; the method is described using prose and mathematical equations. |
| Open Source Code | Yes | Code is made available to facilitate further research: https://github.com/deepcs233/TIN |
| Open Datasets | Yes | We conduct experiments on six video recognition datasets: Something-Something (V1 & V2) (Goyal et al. 2017), Kinetics-600 (Carreira et al. 2018), UCF101 (Soomro, Zamir, and Shah 2012), HMDB51 (Kuehne et al. 2011), Multi-Moments in Time (Monfort et al. 2019), and Jester. |
| Dataset Splits | No | The paper mentions evaluating on 'validation' sets (e.g., 'Val Top-1 Val Top-5' in tables) for datasets like Something V1 and Kinetics-600, and refers to using 'training videos' for Kinetics-600, but it does not explicitly state the specific training/validation/test split percentages or sample counts for any of the datasets used. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using a 'mini-batch stochastic gradient descent' optimizer and pre-trained models, but it does not provide specific version numbers for any software components, libraries, or programming languages used (e.g., Python version, PyTorch version). |
| Experiment Setup | Yes | We set the dropout rate (Srivastava et al. 2014) to 0.5 and the weight decay to 5e-4. We use mini-batch stochastic gradient descent (Bottou 2010) with a momentum of 0.9 as our optimizer. The initial learning rate is 0.005 and is divided by 10 at epochs 10 and 20; training stops at 25 epochs. (See the optimizer sketch below the table.) |
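
Since the paper describes the interlacing operation only in prose and equations (no pseudocode is provided), the following is a minimal, hypothetical sketch of the core idea: channel groups are shifted along the temporal axis by learned fractional offsets, with linear interpolation between the two neighboring integer shifts so the offsets stay differentiable. The names `interlace`, `shift`, and `offsets` are our own assumptions, not the authors' reference implementation; see the linked repository for the real code.

```python
import torch


def shift(x: torch.Tensor, k: int) -> torch.Tensor:
    """Integer temporal shift along dim 1 with zero padding at the boundary."""
    if k == 0:
        return x
    out = torch.zeros_like(x)
    if k > 0:
        out[:, k:] = x[:, :-k]
    else:
        out[:, :k] = x[:, -k:]
    return out


def interlace(x: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
    """Shift channel groups of x (N, T, C, H, W) in time by fractional offsets.

    offsets: (G,) one learned shift per channel group; in TIN these are
    predicted from the features by a small offset module.
    """
    n, t, c, h, w = x.shape
    group = c // offsets.shape[0]
    out = torch.zeros_like(x)
    for i, off in enumerate(offsets):
        lo = torch.floor(off)
        frac = off - lo                 # gradients w.r.t. off flow through frac
        k = int(lo.item())
        ch = slice(i * group, (i + 1) * group)
        # Linearly interpolate between the two neighboring integer shifts.
        out[:, :, ch] = (1 - frac) * shift(x[:, :, ch], k) \
                        + frac * shift(x[:, :, ch], k + 1)
    return out


# Example: 8 frames, 64 channels split into 4 groups with learnable offsets.
x = torch.randn(2, 8, 64, 7, 7)
offsets = torch.tensor([-1.3, -0.5, 0.5, 1.3], requires_grad=True)
y = interlace(x, offsets)
y.sum().backward()                      # offsets receive gradients
```

Because the fractional part of each offset weights the blend of two integer shifts, the loss gradient reaches the offsets themselves, which is what makes the interlacing target learnable end to end.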
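The reported hyperparameters translate directly into a standard training configuration. The snippet below is a minimal sketch assuming a PyTorch setup (the paper names no framework); `model` is a placeholder for the actual TIN network.

```python
import torch

model = torch.nn.Linear(10, 2)   # placeholder for the TIN backbone

# SGD with momentum 0.9 and weight decay 5e-4, as reported.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.005, momentum=0.9, weight_decay=5e-4
)
# Divide the learning rate by 10 at epochs 10 and 20.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10, 20], gamma=0.1
)

for epoch in range(25):          # training stops at 25 epochs
    # ... one training epoch; dropout p=0.5 is applied inside the model ...
    scheduler.step()
```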