Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Activity Grammars for Temporal Action Segmentation

Authors: Dayoung Gong, Joonseok Lee, Deunsol Jung, Suha Kwak, Minsu Cho

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results demonstrate that our method significantly improves temporal action segmentation in terms of both performance and interpretability on two standard benchmarks, Breakfast and 50 Salads.
Researcher Affiliation	Academia	Pohang University of Science and Technology (POSTECH) EMAIL
Pseudocode	Yes	Algorithm 1 shows the parsing procedure of BEP.
Open Source Code	No	The paper does not include an explicit statement or a direct link to an open-source code repository for the methodology described in the paper. The URL provided in the affiliation is a general lab research page, not a specific code release.
Open Datasets	Yes	We conduct experiments on two widely used benchmark datasets for temporal action segmentation: Breakfast [23] and 50 Salads [39].
Dataset Splits	No	The paper mentions using training data and unseen data for grammar evaluation, but it does not explicitly detail the train/validation/test splits (e.g., percentages, sample counts, or a reference to a specific standard split name) for the models used in the experiments.
Hardware Specification	No	The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models. It mentions using 'I3D [4] features' and existing models [9, 45], but no details on their own computational resources.
Software Dependencies	No	The paper does not provide specific version numbers for software dependencies or libraries used in the experiments (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup	Yes	For KARI, we set the hyperparameters of the number of key actions N_key to 4 for Breakfast, and 3 for 50 Salads. ... For BEP, we configured the queue size N_queue to be 20. For efficiency, we adjust the sampling rate of the input video features to 50 for Breakfast and 100 for 50 Salads.
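
The hyperparameters quoted in the Experiment Setup row can be collected into a small per-dataset configuration. The following is a minimal sketch, not the authors' code: the names `ExperimentSetup`, `SETUPS`, and `subsample` are hypothetical, and the helper assumes that "sampling rate" means keeping every k-th feature frame, which the quoted text does not confirm.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    n_key: int          # KARI: number of key actions
    n_queue: int        # BEP: queue size
    sampling_rate: int  # sampling rate of the input video features


# Values are taken from the quoted text; the dictionary layout is an assumption.
SETUPS = {
    "Breakfast": ExperimentSetup(n_key=4, n_queue=20, sampling_rate=50),
    "50 Salads": ExperimentSetup(n_key=3, n_queue=20, sampling_rate=100),
}


def subsample(features: Sequence, rate: int) -> Sequence:
    """Keep every `rate`-th frame.

    Assumption: this is one plausible reading of "sampling rate";
    the paper may define it differently.
    """
    if rate < 1:
        raise ValueError("rate must be a positive integer")
    return features[::rate]


if __name__ == "__main__":
    setup = SETUPS["Breakfast"]
    frames = list(range(1000))  # stand-in for per-frame feature vectors
    reduced = subsample(frames, setup.sampling_rate)
    print(setup, len(reduced))
```

Keeping the values in one frozen dataclass per dataset makes it easy to check a rerun against the reported settings, although the remaining details (data splits, hardware, software versions) are marked as not reported in the table above.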