Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Video Token Merging for Long Video Understanding

Authors: Seon-Ho Lee, Jue Wang, Zhikang Zhang, David Fan, Xinyu Li

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experimental results show that we achieve better or comparable performances on the LVU, COIN, and Breakfast datasets.
Researcher Affiliation	Collaboration	Seon-Ho Lee Korea University EMAIL Jue Wang Amazon AGI EMAIL Zhikang Zhang Amazon AGI EMAIL David Fan Meta FAIR EMAIL Xinyu Li Amazon AGI EMAIL
Pseudocode	No	The paper describes algorithmic steps and includes figures illustrating architectures (Figure 2, Figure 4, Figure 6), but it does not contain a dedicated pseudocode or algorithm block.
Open Source Code	No	The paper does not explicitly state that the code is publicly available, nor does it provide a link to a code repository. The NeurIPS checklist also indicates 'No' for open access to code.
Open Datasets	Yes	LVU (Wu & Krähenbühl, 2021): It contains 30K videos sampled from 3K movies on the Movie Clips (mov) website. Most videos are 1 to 3 minutes long... Breakfast (Kuehne et al., 2014): It provides 1,712 videos... COIN (Tang et al., 2019): It consists of 11,827 videos...
Dataset Splits	No	The paper uses standard datasets and mentions evaluation metrics, but it does not explicitly provide the specific training, validation, and test dataset splits with percentages or sample counts.
Hardware Specification	Yes	For experiments, we use 8 Tesla V100 GPUs and PyTorch.
Software Dependencies	No	The paper mentions using 'PyTorch' but does not provide a specific version number for it or other software dependencies.
Experiment Setup	Yes	We use the AdamW (Loshchilov & Hutter, 2017) optimizer with a batch size of 16 and a weight decay of 0.01. We set the learning rate to 0.001. We train the network for 70 epochs by using cosine learning rate scheduler (Gotmare et al., 2018) with 10 epochs warm-up.
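
The Experiment Setup row gives a peak learning rate of 0.001, 70 training epochs, and a cosine schedule with 10 warm-up epochs. The sketch below shows one plausible reading of that schedule in plain Python. The helper name `lr_at_epoch`, the linear warm-up shape, the decay floor of zero, and per-epoch (rather than per-step) updates are all assumptions, since the quoted text does not specify them. The optimizer itself (AdamW, batch size 16, weight decay 0.01) is not modeled here.

```python
import math

BASE_LR = 1e-3       # learning rate quoted in the table
TOTAL_EPOCHS = 70    # training length quoted in the table
WARMUP_EPOCHS = 10   # warm-up length quoted in the table


def lr_at_epoch(epoch, base_lr=BASE_LR, total=TOTAL_EPOCHS, warmup=WARMUP_EPOCHS):
    """Return the learning rate for a 0-indexed epoch (hypothetical helper).

    Assumptions not stated in the source: the warm-up is linear, the cosine
    phase decays toward zero, and the rate changes once per epoch.
    """
    if not 0 <= epoch < total:
        raise ValueError(f"epoch must be in [0, {total}), got {epoch}")
    if epoch < warmup:
        # Linear ramp that reaches base_lr on the last warm-up epoch.
        return base_lr * (epoch + 1) / warmup
    # Fraction of the cosine phase completed so far, in [0, 1).
    progress = (epoch - warmup) / (total - warmup)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Full per-epoch schedule under the assumptions above.
schedule = [lr_at_epoch(e) for e in range(TOTAL_EPOCHS)]
```

In a PyTorch training loop, a function like this could be wrapped in `torch.optim.lr_scheduler.LambdaLR` by returning the multiplier `lr_at_epoch(epoch) / BASE_LR`. Whether that matches the authors' implementation cannot be confirmed, because the code is not public.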