Compressed Video Contrastive Learning

Authors: Yuqi Huo, Mingyu Ding, Haoyu Lu, Nanyi Fei, Zhiwu Lu, Ji-Rong Wen, Ping Luo

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments on two downstream tasks show that our MVCGC yields new state-of-the-art while being significantly more efficient than its competitors.
Researcher Affiliation | Academia | (1) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; (2) Beijing Key Laboratory of Big Data Management and Analysis Methods; (3) School of Information, Renmin University of China, Beijing, China; (4) The University of Hong Kong, Pokfulam, Hong Kong, China
Pseudocode | Yes | Algorithm 1: Motion Vector based Cross Guidance Contrastive Learning (MVCGC)
Open Source Code | No | The paper does not explicitly state that the source code for the described methodology is publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | In this paper, we use UCF101 [Soomro et al., 2012] and Kinetics-400 (K400) [Kay et al., 2017] for self-supervised pre-training. ... we benchmark downstream evaluation tasks on the first test set of UCF101, and the test split 1 of HMDB51 [Kuehne et al., 2011], a relatively small action dataset containing 6,766 videos with 51 categories.
Dataset Splits | Yes | K400 is a larger dataset consisting of 400 human action classes and has 230k/20k clips for training/validation, respectively. ... UCF101 contains 13,320 videos with 101 action classes and has three standard training/test splits. ... HMDB51 [Kuehne et al., 2011] is a relatively small action dataset containing 6,766 videos with 51 categories.
Hardware Specification | Yes | All experiments are trained on 4 Titan RTX GPUs, with a batch size of 32 samples per GPU. All methods are measured in exactly the same environment: Intel Xeon 5118 CPUs and a Titan RTX GPU.
Software Dependencies | No | The paper mentions 'pyav' and 'FFmpeg libraries' but does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | For the pre-training on UCF101, temperature τ = 0.07, momentum m = 0.999 and queue size 2048 are used, while the queue size is set to 16384 on K400. When pre-training on UCF101, the initialization stage lasts 300 epochs for each stream, and we then continually train the cross guidance for another 200 epochs. On K400, we train 200 epochs for each stream in the initialization stage and 50 epochs for cross guidance contrastive learning. 100 and 500 epochs are used for linear probing and full fine-tuning, respectively. We use the Adam optimizer with a 1e-4 learning rate and 1e-5 weight decay for pre-training, and the SGD optimizer with a 1e-1 learning rate and 1e-3 weight decay for fine-tuning. The learning rate is decayed by 1/10 twice when the validation loss plateaus. The hyper-parameter k in MVCGC is set to 5 according to the ablation study. All experiments are trained on 4 Titan RTX GPUs, with a batch size of 32 samples per GPU.
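The Experiment Setup row quotes MoCo-style hyper-parameters (temperature τ = 0.07, momentum m = 0.999, queue size 2048 on UCF101). The sketch below illustrates how those three quantities typically interact in momentum-contrast pre-training: an InfoNCE loss over one positive key and a FIFO queue of negatives, plus an exponential-moving-average update of the key encoder. This is a generic illustration under those assumptions, not the authors' MVCGC implementation; all function names are hypothetical.

```python
import numpy as np

TAU = 0.07          # softmax temperature quoted in the paper
M = 0.999           # momentum for the key-encoder EMA update
QUEUE_SIZE = 2048   # negatives queue (UCF101 setting; 16384 on K400)

def momentum_update(query_params, key_params, m=M):
    """EMA update of the key encoder: key <- m*key + (1-m)*query."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]

def info_nce_loss(q, k_pos, queue, tau=TAU):
    """InfoNCE over one positive key and a queue of negatives.

    q, k_pos: L2-normalized (D,) embeddings; queue: (K, D) negatives.
    """
    l_pos = q @ k_pos                              # scalar positive logit
    l_neg = queue @ q                              # (K,) negative logits
    logits = np.concatenate([[l_pos], l_neg]) / tau
    logits -= logits.max()                         # numerical stability
    # Cross-entropy with the positive at index 0.
    return -(logits[0] - np.log(np.exp(logits).sum()))

def enqueue(queue, keys):
    """FIFO queue update: prepend new keys, drop the oldest overflow."""
    return np.concatenate([keys, queue], axis=0)[:QUEUE_SIZE]
```

In the cross-guidance setting described by the paper, the query and key embeddings would come from the two streams (RGB frames and motion vectors); the loss, momentum update, and queue mechanics above are stream-agnostic.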