EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens

Authors: Sunil Hwang, Jaehong Yoon, Youngwan Lee, Sung Ju Hwang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5. Experiments. We extensively validate our proposed method on multiple benchmark datasets, including UCF101, HMDB51, K400, Something-Something V2, and Ego4D, and our EVEREST shows remarkable efficiency in terms of memory occupancy, computational cost, and training time compared to strong counterparts, achieving competitive performance.
Researcher Affiliation | Collaboration | 1 Korea Military Academy, 2 UNC Chapel Hill, 3 KAIST, 4 ETRI, 5 Deep Auto.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks (a hypothetical illustration of the token-removal idea appears after this table).
Open Source Code | No | The paper refers to the VideoMAE repository as the base for its implementation but does not explicitly state that the code for EVEREST (the proposed method) is open source or provide a link to it.
Open Datasets | Yes | We extensively validate our proposed method on multiple benchmark datasets, including UCF101 (Soomro et al., 2012), HMDB51 (Kuehne et al., 2011), Something-Something v2 (SSv2) (Goyal et al., 2017), Kinetics-400 (K400) (Kay et al., 2017), and Ego4D (Grauman et al., 2022).
Dataset Splits | Yes | The OSCC dataset is a subset of the Ego4D dataset, consisting of 41.1k/21.2k train/val 8-second videos.
Hardware Specification | Yes | Enabling pre-training and fine-tuning on a single machine with 8 GPUs; one node equipped with 8 A100 (80GB) GPUs; a single-node machine equipped with 8 A6000 (48GB) GPUs; 4 NVIDIA RTX 3090 GPUs are used.
Software Dependencies | No | The paper mentions software components such as the AdamW optimizer and cosine decay schedule but does not provide specific version numbers for these or for other key software dependencies (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | Table 7 (pre-training settings for K400, SSv2, UCF101, HMDB51, and OSCC) and Table 8 (fine-tuning settings for the same datasets) specify detailed hyperparameters such as optimizer, learning rate, batch size, warmup epochs, masking ratios, and augmentation strategies (an illustrative configuration sketch appears below).
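Because the paper contains no pseudocode (per the Pseudocode row above), the sketch below shows one generic, frame-difference-based way to drop redundant spatiotemporal tokens, i.e., the kind of idea the paper's title names. The selection criterion, the function name keep_informative_tokens, the keep_ratio parameter, and the tensor shapes are all illustrative assumptions and should not be read as EVEREST's actual algorithm.

import torch

def keep_informative_tokens(tokens: torch.Tensor, keep_ratio: float = 0.5):
    """Keep the tokens that change most between temporally adjacent frames.

    tokens: (T, N, D) patch embeddings -- T frames, N patches per frame, D dims.
    Returns the kept tokens as (K, D) plus their flat indices into (T*N,).
    """
    T, N, D = tokens.shape
    # Change magnitude w.r.t. the previous frame; the first frame is compared to itself.
    prev = torch.cat([tokens[:1], tokens[:-1]], dim=0)
    change = (tokens - prev).norm(dim=-1).flatten()   # (T*N,)
    k = max(1, int(keep_ratio * T * N))
    keep_idx = change.topk(k).indices                 # most dynamic tokens survive
    return tokens.reshape(T * N, D)[keep_idx], keep_idx

# Usage with random embeddings standing in for real patch tokens.
dummy = torch.randn(8, 196, 768)                      # 8 frames, 14x14 patches, ViT-B width
kept, idx = keep_informative_tokens(dummy, keep_ratio=0.25)
print(kept.shape)                                     # torch.Size([392, 768])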
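Since the paper reports its optimizer, schedule, and masking settings only in Tables 7 and 8, the following is a minimal sketch of what such a pre-training setup could look like in PyTorch. Only the use of AdamW and cosine decay comes from the paper; every numeric value, the config keys, and the placeholder model are hypothetical and are not taken from the paper's tables.

import torch
import torch.nn as nn

# Hypothetical hyperparameters, standing in for the values listed in Tables 7/8.
config = {
    "base_lr": 1.5e-4,       # placeholder learning rate
    "weight_decay": 0.05,    # placeholder weight decay
    "warmup_epochs": 5,      # placeholder warmup length
    "total_epochs": 100,     # placeholder pre-training length
    "mask_ratio": 0.9,       # placeholder: fraction of tokens hidden from the encoder
}

model = nn.Linear(768, 768)  # stand-in for the actual masked video autoencoder

optimizer = torch.optim.AdamW(
    model.parameters(), lr=config["base_lr"], weight_decay=config["weight_decay"]
)

# Linear warmup chained into cosine decay, a common pairing with AdamW.
warmup = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1e-3, total_iters=config["warmup_epochs"]
)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=config["total_epochs"] - config["warmup_epochs"]
)
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optimizer, schedulers=[warmup, cosine], milestones=[config["warmup_epochs"]]
)

for epoch in range(config["total_epochs"]):
    optimizer.zero_grad()
    # Dummy objective; a real masked-video reconstruction loss would go here.
    loss = model(torch.randn(4, 768)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()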