Self-Supervised Learning of Compressed Video Representations
Authors: Youngjae Yu, Sangho Lee, Gunhee Kim, Yale Song
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our approach achieves competitive performance on compressed video recognition both in supervised and self-supervised regimes. ... 3 EXPERIMENTS ... Table 1 summarizes the results. ... Table 3 summarizes the results. |
| Researcher Affiliation | Collaboration | Youngjae Yu, Sangho Lee, Gunhee Kim (Seoul National University) {yj.yu,sangho.lee}@vision.snu.ac.kr, gunhee@snu.ac.kr; Yale Song (Microsoft Research) yalesong@microsoft.com |
| Pseudocode | Yes | Algorithm 1: Self-supervision label for Pyramidal Motion Statistics Prediction |
| Open Source Code | No | No explicit statement that the authors release their code, and no link to a code repository, was found. |
| Open Datasets | Yes | We pretrain our model on Kinetics-400 (Kay et al., 2017). For evaluation, we finetune the pretrained model for action recognition using UCF-101 (Soomro et al., 2012) and HMDB-51 (Kuehne et al., 2011). |
| Dataset Splits | Yes | We use the standard training and evaluation protocols for both UCF-101 (Soomro et al., 2012) and HMDB-51 (Kuehne et al., 2011). |
| Hardware Specification | Yes | We use 4 NVIDIA Tesla V100 GPUs and use a batch size of 100. ... Table 2 shows per-frame runtime speed (ms) and GFLOPs measured on an NVIDIA Tesla P100 GPU with Intel E5-2698 v4 CPUs |
| Software Dependencies | No | No specific software versions (e.g., Python, PyTorch, CUDA) were mentioned, only general software components such as '3D ResNet' and 'SGD'. |
| Experiment Setup | Yes | We pretrain our model end-to-end from scratch for 20 epochs, including the initial warm-up period of 5 epochs. For downstream scenarios, we finetune our model for 500 epochs for UCF-101 and for 300 epochs for HMDB-51, including the warm-up period of 30 epochs. For both the pretraining and finetuning stages, we use SGD with momentum 0.9, weight decay 10⁻⁴, and half-period cosine learning rate schedule. We use 4 NVIDIA Tesla V100 GPUs and use a batch size of 100. |
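
The Experiment Setup row above specifies the reported optimization recipe (SGD with momentum 0.9, weight decay 10⁻⁴, a warm-up period, and a half-period cosine learning rate schedule). The following is a minimal PyTorch sketch of that recipe, not the authors' released code: the base learning rate, the `model` object, and the `train_one_epoch` helper are hypothetical placeholders, since the excerpt does not specify them.

```python
# Sketch of the reported setup: SGD (momentum 0.9, weight decay 1e-4) with
# linear warm-up followed by a half-period cosine learning rate decay.
# base_lr, model, and train_one_epoch are assumptions, not taken from the paper.
import math
import torch

def make_optimizer_and_scheduler(model, total_epochs, warmup_epochs, base_lr=0.1):
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=base_lr,
        momentum=0.9,       # momentum from the reported setup
        weight_decay=1e-4,  # weight decay 10^-4 from the reported setup
    )

    def lr_factor(epoch):
        if epoch < warmup_epochs:
            # Linear warm-up toward the base learning rate.
            return (epoch + 1) / warmup_epochs
        # Half-period cosine decay over the remaining epochs.
        progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_factor)
    return optimizer, scheduler

# Example matching the reported pretraining schedule (20 epochs, 5 warm-up),
# with the scheduler stepped once per epoch:
# optimizer, scheduler = make_optimizer_and_scheduler(model, total_epochs=20, warmup_epochs=5)
# for epoch in range(20):
#     train_one_epoch(model, optimizer)  # hypothetical training loop
#     scheduler.step()
```

The same helper would cover the finetuning stages described in the row (500 epochs for UCF-101, 300 for HMDB-51, with a 30-epoch warm-up) by changing `total_epochs` and `warmup_epochs` accordingly.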