Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Storyboard-guided Alignment for Fine-grained Video Action Recognition

Authors: Enqi Liu, Liyuan Pan, Yan Yang, Yiran Zhong, Zhijing Wu, Xinxiao Wu, Liu Liu

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on various video action recognition datasets demonstrate the competitive performance of our SFAR in supervised, few-shot, and zero-shot settings.
Researcher Affiliation	Collaboration	(1) Beijing Institute of Technology, Beijing, China; (2) Yangtze Delta Region Academy of Beijing Institute of Technology, Jiaxing, China; (3) BDSI, Australian National University, Canberra, Australia; (4) Open NLPLab, Shanghai, China; (5) Huawei, Beijing, China
Pseudocode	No	The paper describes the methodology in narrative text and uses diagrams (e.g., Figure 3) to illustrate the framework, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: If the paper is accepted, the code and data will be released.
Open Datasets	Yes	Datasets. We experiment across four extensively recognized video benchmarks: Kinetics-400 [17], Charades [33], UCF-101 [34], and HMDB-51 [19] datasets.
Dataset Splits	Yes	The Kinetics-400 dataset is curated from YouTube, spans 400 action classes, and contains 240,000 training videos and 20,000 validation videos. The HMDB-51 dataset consists of 6,766 videos categorized into 51 action classes, with 3,570 videos used for training, and 1,530 videos used for testing. ... We follow [39, 26] to experiment with different shot settings, selecting 2, 4, 8, and 16 examples per human action category for training.
Hardware Specification	No	The paper mentions 'compute workers CPU or GPU, internal cluster, or cloud provider' in the NeurIPS checklist, but it does not specify the actual hardware models (e.g., specific GPU or CPU models, memory details) used for running its experiments.
Software Dependencies	No	Our model is implemented using the PyTorch framework.
Experiment Setup	Yes	Our model is implemented using the PyTorch framework. We train our network with batch size 256 for 30 epochs using the AdamW optimizer. The learning rate is set to 5 × 10^-5, and we use the cosine annealing strategy with 5 warm-up epochs. We follow [48] for data augmentation in training. (Further details are provided in Table 5, Optimisation: optimizer AdamW; optimizer betas (0.9, 0.999); batch size 256; learning rate schedule cosine; linear warmup epochs 5; base learning rate 5e-6; epochs 30 for fully-sup, 2,10 for few-shot; weight decay 0.02.)
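
The schedule described in the Experiment Setup row (linear warmup followed by cosine annealing) can be illustrated with a minimal sketch. It is not the authors' code. It uses the Table 5 values quoted above (base learning rate 5e-6, 5 warmup epochs, 30 epochs) and assumes per-epoch updates, a warmup that reaches the base rate on the last warmup epoch, and a decay floor of zero; none of these details are stated in the paper excerpt. The helper name `lr_at_epoch` is hypothetical.

```python
import math

# Values quoted from the Experiment Setup row (Table 5, fully supervised setting).
BASE_LR = 5e-6
WARMUP_EPOCHS = 5
TOTAL_EPOCHS = 30


def lr_at_epoch(epoch, base_lr=BASE_LR, warmup_epochs=WARMUP_EPOCHS,
                total_epochs=TOTAL_EPOCHS, min_lr=0.0):
    """Return the learning rate for a 0-indexed epoch (hypothetical helper)."""
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must be in [0, total_epochs)")
    if epoch < warmup_epochs:
        # Assumption: linear ramp that reaches base_lr on the last warmup epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Assumption: cosine decay from base_lr towards min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in range(TOTAL_EPOCHS):
        print(f"epoch {epoch:2d}: lr = {lr_at_epoch(epoch):.3e}")
```

In a PyTorch training loop, the same shape would typically be obtained by attaching a learning rate scheduler to an AdamW optimizer; the pure-Python version is used here only to keep the sketch dependency-free.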