Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
Authors: Sunil Hwang, Jaehong Yoon, Youngwan Lee, Sung Ju Hwang
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experiments. We extensively validate our proposed method on multiple benchmark datasets, including UCF101, HMDB51, K400, Something-Something V2, and Ego4D, and our EVEREST shows remarkable efficiency in terms of memory occupancy, computational cost, and training time compared to strong counterparts, achieving competitive performance. |
| Researcher Affiliation | Collaboration | (1) Korea Military Academy, (2) UNC Chapel Hill, (3) KAIST, (4) ETRI, (5) Deep Auto. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper refers to the VideoMAE repository as the base for its implementation but does not explicitly state that the code for EVEREST (the proposed method) is open-source, nor does it provide a link to it. |
| Open Datasets | Yes | We extensively validate our proposed method on multiple benchmark datasets, including UCF101 (Soomro et al., 2012), HMDB51 (Kuehne et al., 2011), Something-Something v2 (SSv2) (Goyal et al., 2017), Kinetics-400 (K400) (Kay et al., 2017) and Ego4D (Grauman et al., 2022). |
| Dataset Splits | Yes | The OSCC dataset is the subset of the Ego4D dataset, consisting of 41.1k/21.2k train/val 8-second videos. |
| Hardware Specification | Yes | enabling the pre-training and fine-tuning on a single machine with 8 GPUs; using one node equipped with 8 A100 (80GB) GPUs; a single-node machine equipped with 8 A6000 (48GB) GPUs; 4 NVIDIA RTX 3090 GPUs are used. |
| Software Dependencies | No | The paper mentions software components like 'AdamW optimizer' and 'Cosine decay' but does not provide specific version numbers for these or other key software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Table 7: Pre-training settings for K400, SSv2, UCF101, HMDB51 and OSCC; Table 8: Fine-tuning settings for K400, SSv2, UCF101, HMDB51 and OSCC. These tables specify detailed hyperparameters such as optimizer, learning rate, batch size, warmup epochs, masking ratios, and augmentation strategies. |