RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning

Authors: Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan

AAAI 2021, pp. 1045-1053 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4 Experiments"; "We compare our method with the state-of-the-art self-supervised learning methods in Table 2. We report top-1 accuracy on the UCF101 and HMDB51 datasets together with the backbone and pre-training dataset."; "Table 1: Comparison of different pre-training settings on UCF101 and HMDB51 datasets."
Researcher Affiliation | Collaboration | (1) School of Software Engineering, South China University of Technology; (2) Baidu Inc.; (3) MIT-IBM Watson AI Lab; (4) Key Laboratory of Big Data and Intelligent Robot, Ministry of Education; (5) Pazhou Laboratory
Pseudocode | Yes | "Algorithm 1: Training method of RSPNet"
Open Source Code | Yes | "Our code, pre-trained models, and supplementary materials can be found at https://github.com/PeihaoChen/RSPNet."
Open Datasets | Yes | "We pre-train models on the training set of the Kinetics-400 dataset (Carreira and Zisserman 2017)"; "The UCF101 (Soomro, Zamir, and Shah 2012) dataset consists of 13,320 videos from 101 realistic action categories on YouTube."; "The HMDB51 (Kuehne et al. 2011) dataset consists of 6,849 clips from 51 action classes."; "The Something-Something-V2 (Something-V2) dataset (Goyal et al. 2017) contains 220,847 videos with 174 classes and focuses more on modeling temporal relationships (Lin, Gan, and Han 2019)."
Dataset Splits | No | The paper mentions pre-training on the Kinetics-400 training set and fine-tuning on UCF101, HMDB51, and Something-V2, but it does not explicitly provide train/validation/test split percentages or sample counts for its own experiments, nor does it cite the specific standard splits used.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for its experiments.
Software Dependencies | No | The paper mentions using SGD as the optimizer but does not specify ancillary software details such as programming-language or deep-learning-framework versions.
Experiment Setup | Yes | "We use SGD as the optimizer with a minibatch size of 64. We train the model for 200 epochs by default. The learning rate policy is linear cosine decay starting from 0.1. Following He et al. (2020), we set τ = 0.07, K = 16384, γ = 0.15 and λ = 1 for Equations (1), (2) and (3)."; "We fine-tune our RSPNet on UCF101, HMDB51, and Something-V2 with labeled videos for action recognition. We train for 30, 70 and 50 epochs on these datasets, respectively, with a learning rate of 0.01."
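The quoted setup (SGD, minibatch size 64, 200 epochs, cosine decay from an initial learning rate of 0.1, plus the MoCo-style constants) can be sketched in plain Python. This is a minimal sketch, not the authors' implementation: the half-cosine annealing formula below is an assumption, since the paper only names the policy, and the constants are simply copied from the quote above.

```python
import math

# Pre-training hyperparameters as reported in the paper.
INIT_LR = 0.1        # initial learning rate
TOTAL_EPOCHS = 200   # default pre-training length
BATCH_SIZE = 64      # SGD minibatch size

# MoCo-style constants quoted from the paper (following He et al. 2020);
# their exact roles are defined in the paper's Equations (1)-(3).
TAU = 0.07           # contrastive-loss temperature
K = 16384            # negative-queue size
GAMMA = 0.15         # constant for Equation (2)
LAMBDA = 1.0         # loss-balancing weight for Equation (3)


def cosine_lr(epoch: int,
              init_lr: float = INIT_LR,
              total_epochs: int = TOTAL_EPOCHS) -> float:
    """Assumed half-cosine annealing: decays init_lr smoothly to 0
    over total_epochs. The paper says only 'linear cosine decay
    starting from 0.1', so this exact formula is a guess."""
    return init_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))


if __name__ == "__main__":
    # The schedule starts at 0.1, passes 0.05 at the midpoint,
    # and reaches (approximately) 0 at epoch 200.
    for epoch in (0, 100, 200):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.4f}")
```

For fine-tuning, the paper instead reports a fixed starting learning rate of 0.01 with 30, 70, and 50 epochs on UCF101, HMDB51, and Something-V2, respectively.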