MEID: Mixture-of-Experts with Internal Distillation for Long-Tailed Video Recognition

Authors: Xinjie Li, Huijuan Xu

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on the existing long-tailed video benchmark VideoLT and the two new benchmarks to verify the effectiveness of our proposed method with state-of-the-art performance.
Researcher Affiliation | Academia | Pennsylvania State University, University Park, USA; {xql5497, hkx5063}@psu.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. The methods are described in prose and mathematical equations.
Open Source Code | Yes | The code and proposed benchmarks are released at https://github.com/VisionLanguageLab/MEID.
Open Datasets | Yes | VideoLT (Zhang et al. 2021b) is the only existing long-tailed video recognition benchmark... We propose two new benchmarks, CharadesLT and CharadesEgoLT, based on the videos from the Charades dataset (Sigurdsson et al. 2016) and CharadesEgo (Sigurdsson et al. 2018).
Dataset Splits | Yes | Table 1: Data statistics on three datasets.

Dataset                    | #Classes | #Head | #Medium | #Tail | #Training | #Validation | #Test  | #Label/#Video | #IR
VideoLT (Zhang et al. 2021b) | 1,004  | 47    | 617     | 340   | 179,334   | 25,619      | 51,239 | 1.1           | 43.5
CharadesLT (ours)            | 157    | 41    | 81      | 35    | 2,213     | 469         | 1,012  | 6.9           | 61.8
CharadesEgoLT (ours)         | 157    | 43    | 89      | 25    | 1,553     | 764         | 748    | 9.6           | 53.8
Hardware Specification | Yes | Last, all the experiments are conducted on the PyTorch framework with NVIDIA A5000 GPUs.
Software Dependencies | No | The paper mentions the 'PyTorch framework' but does not specify a version number for PyTorch or any other software dependency.
Experiment Setup | Yes | The Adam optimizer with an initial learning rate of 1e-3 is used to train the first stage. In the second stage, the initial learning rate of 1e-4 is used. Each stage is trained for 100 epochs, and the learning rate is decayed at the 30th and 60th epochs by 0.1. We adopt focal loss (Lin et al. 2020) as our classification loss following the previous work (Zhang et al. 2021b). We set the loss weights λ1 in Eq. 13 and λ2 in Eq. 14 to 0.1.
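The experiment setup above specifies a two-stage training schedule: Adam with an initial learning rate of 1e-3 (stage one) or 1e-4 (stage two), 100 epochs per stage, and a step decay of 0.1 at epochs 30 and 60. A minimal sketch of that step-decay schedule is below; the function name and structure are hypothetical, since the paper states only the hyperparameters, not code.

```python
def stepped_lr(initial_lr, epoch, milestones=(30, 60), gamma=0.1):
    """Learning rate after step decay: multiplied by `gamma` at each milestone.

    Hypothetical helper illustrating the schedule described in the paper
    (decay by 0.1 at the 30th and 60th epochs of a 100-epoch stage).
    """
    lr = initial_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Stage one starts at 1e-3; stage two restarts the same schedule at 1e-4.
stage1_lrs = [stepped_lr(1e-3, e) for e in range(100)]
stage2_lrs = [stepped_lr(1e-4, e) for e in range(100)]
```

In PyTorch this corresponds to wrapping a `torch.optim.Adam` optimizer with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60], gamma=0.1)`.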