Activity Grammars for Temporal Action Segmentation

Authors: Dayoung Gong, Joonseok Lee, Deunsol Jung, Suha Kwak, Minsu Cho

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that our method significantly improves temporal action segmentation in terms of both performance and interpretability on two standard benchmarks, Breakfast and 50 Salads.
Researcher Affiliation | Academia | Pohang University of Science and Technology (POSTECH) {dayoung.gong, jameslee, deunsol.jung, suha.kwak, mscho}@postech.ac.kr
Pseudocode | Yes | Algorithm 1 shows the parsing procedure of BEP.
Open Source Code | No | The paper does not include an explicit statement or a direct link to an open-source code repository for the methodology described in the paper. The URL provided in the affiliation is a general lab research page, not a specific code release.
Open Datasets | Yes | We conduct experiments on two widely used benchmark datasets for temporal action segmentation: Breakfast [23] and 50 Salads [39].
Dataset Splits | No | The paper mentions using training data and unseen data for grammar evaluation, but it does not explicitly detail the train/validation/test splits (e.g., percentages, sample counts, or a reference to a specific standard split name) for the models used in the experiments.
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models. It mentions using 'I3D [4] features' and existing models [9, 45], but gives no details on the authors' own computational resources.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | For KARI, we set the hyperparameters of the number of key actions N_key to 4 for Breakfast, and 3 for 50 Salads. ... For BEP, we configured the queue size N_queue to be 20. For efficiency, we adjust the sampling rate of the input video features to 50 for Breakfast and 100 for 50 Salads.
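
Since no code release is reported, the setup values quoted above can be collected into a small configuration sketch for anyone attempting a reproduction. The names below (ExperimentConfig, CONFIGS, subsample) are hypothetical and do not come from the paper. The sketch also assumes that a "sampling rate" of r means keeping every r-th frame-level feature, which the quoted text does not confirm.

```python
from dataclasses import dataclass
from typing import List, Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    n_key: int          # number of key actions for KARI
    n_queue: int        # queue size for BEP
    sampling_rate: int  # assumed meaning: keep every sampling_rate-th feature frame


# Values as reported for each benchmark; the dictionary keys are illustrative.
CONFIGS = {
    "breakfast": ExperimentConfig(n_key=4, n_queue=20, sampling_rate=50),
    "50salads": ExperimentConfig(n_key=3, n_queue=20, sampling_rate=100),
}


def subsample(features: Sequence[T], rate: int) -> List[T]:
    """Keep every rate-th frame-level feature (an assumed reading of 'sampling rate')."""
    if rate < 1:
        raise ValueError("rate must be a positive integer")
    return list(features[::rate])


if __name__ == "__main__":
    cfg = CONFIGS["breakfast"]
    frames = list(range(1000))  # stand-in for 1000 per-frame feature vectors
    print(len(subsample(frames, cfg.sampling_rate)))
```

Under this reading, a 1000-frame Breakfast video would be reduced to 20 feature frames before parsing. If the paper instead uses the rate differently (for example, as a target frame count), only subsample would need to change.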