Cycle-Contrast for Self-Supervised Video Representation Learning

Authors: Quan Kong, Wenpeng Wei, Ziwei Deng, Tomoaki Yoshinaga, Tomokazu Murakami

NeurIPS 2020

Reproducibility Variable: Result — LLM Response

Research Type: Experimental
"In this section, we evaluate the effectiveness of our representation learning approach by using four datasets: Kinetics-400 [12], UCF101 [25], HMDB51 [15] and MMAct [23] under standard evaluation protocols. The learned network backbones are evaluated via two tasks: nearest-neighbour retrieval and action recognition."

Researcher Affiliation: Industry
Quan Kong, Wenpeng Wei, Ziwei Deng, Tomoaki Yoshinaga, Tomokazu Murakami — Lumada Data Science Lab., Hitachi, Ltd. (quan.kong.xz@hitachi.com)

Pseudocode: No
The paper describes the method in text and diagrams (Figure 1); it contains no explicit pseudocode or algorithm blocks.

Open Source Code: No
The paper contains no statement about releasing source code and provides no link to a code repository.

Open Datasets: Yes
"In this section, we evaluate the effectiveness of our representation learning approach by using four datasets: Kinetics-400 [12], UCF101 [25], HMDB51 [15] and MMAct [23] under standard evaluation protocols."

Dataset Splits: Yes
"We use the test split 1 of UCF101 to evaluate our self-supervised method. ... We pre-train our network by using CCL on the Kinetics-400 train split. ... The network is fine-tuned end-to-end, as in other methods, for 35 epochs on all test datasets."

Hardware Specification: No
"The training was performed on 4 GPUs, taking 8 days on Kinetics-400 and 0.5 day on UCF101." The number of GPUs is given, but the GPU model and memory are not specified.

Software Dependencies: No
The paper mentions the SGD optimizer and an R3D architecture, but does not name any software libraries with version numbers.

Experiment Setup: Yes
"We constrain our experiments to a 3D ResNet. Table 1 provides the specifications of the network. It has L = 8 frames scaled to 128 x 171 and randomly cropped to the size 112 x 112 as the network input. ... The temperature parameter τ is set to 1 in Eq. 2 and Eq. 4. Balance parameters w1, w2 and w3 in Eq. 6 are set to 0.2, 0.2 and 0.4. The training was performed on 4 GPUs. ... Self-training phase: we train our network by using CCL on UCF101 train split 1. The mini-batch is set to 48 videos, using the SGD optimizer with learning rate 0.0001. We divide the learning rate by 10 every 20 epochs for a total of 100 epochs. Weight decay is set to 0.005. ... We pre-train our network by using CCL on the Kinetics-400 train split. The mini-batch is set to 48 videos, using the SGD optimizer with learning rate 0.01. We divide the learning rate by 10 every 20 epochs for a total of 80 epochs. Weight decay is set to 0.0001. Fine-tune phase: the network is fine-tuned end-to-end, as in other methods, for 35 epochs on all test datasets."
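The review notes that the learned backbones are evaluated via nearest-neighbour retrieval. Since the paper releases no code, the following is only a minimal sketch of what such a protocol typically looks like: each query feature is matched against gallery features by cosine similarity, and a query counts as a hit if any of its top-k neighbours shares its class label. All function names and the toy feature vectors are our own, not the authors'.

```python
import math

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def topk_recall(queries, query_labels, gallery, gallery_labels, k=1):
    """Fraction of queries whose top-k nearest neighbours (by cosine
    similarity) contain the query's class label."""
    hits = 0
    for q, ql in zip(queries, query_labels):
        order = sorted(range(len(gallery)),
                       key=lambda i: cosine(q, gallery[i]),
                       reverse=True)
        if ql in {gallery_labels[i] for i in order[:k]}:
            hits += 1
    return hits / len(queries)

# Toy example (invented data): two gallery clips, two queries.
gallery = [[1.0, 0.0], [0.0, 1.0]]
gallery_labels = ["run", "jump"]
queries = [[0.9, 0.1], [0.2, 0.8]]
query_labels = ["run", "jump"]
print(topk_recall(queries, query_labels, gallery, gallery_labels, k=1))
```

In practice the gallery would hold training-set clip features and the queries test-set features, both extracted from the frozen pre-trained backbone.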
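The reported training recipe uses a simple step decay: the learning rate is divided by 10 every 20 epochs (base 0.0001 for UCF101 self-training, 0.01 for Kinetics-400 pre-training). A minimal sketch of that schedule, assuming a plain epoch-indexed step decay; the function name is ours, not from the paper's (unreleased) code:

```python
def step_lr(base_lr: float, epoch: int, step: int = 20, gamma: float = 0.1) -> float:
    """Learning rate at a given epoch under step decay:
    divide by 1/gamma every `step` epochs."""
    return base_lr * (gamma ** (epoch // step))

# UCF101 self-training phase (base lr 1e-4, 100 epochs):
ucf_schedule = [step_lr(1e-4, e) for e in range(0, 100, 20)]
# Kinetics-400 pre-training phase (base lr 0.01, 80 epochs):
kinetics_schedule = [step_lr(0.01, e) for e in range(0, 80, 20)]
```

In a framework such as PyTorch this corresponds to a standard step scheduler with step size 20 and decay factor 0.1 attached to the SGD optimizer.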