ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation

Authors: Dayoung Gong, Suha Kwak, Minsu Cho

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate the bi-directional benefits between action segmentation and anticipation. ActFusion achieves the state-of-the-art performance across the standard benchmarks of 50 Salads, Breakfast, and GTEA, outperforming task-specific models in both of the two tasks with a single unified model through joint learning.
Researcher Affiliation | Academia | Dayoung Gong, Suha Kwak, Minsu Cho — Pohang University of Science and Technology (POSTECH). {dayoung.gong, suha.kwak, mscho}@postech.ac.kr
Pseudocode | Yes | We provide training algorithms of ActFusion in Alg. 1 and inference algorithms for TAS and LTA in Alg. 2 and Alg. 3, respectively.
Open Source Code | Yes | We include the code and instructions for reproduction in the supplementary. The training and validation data is available online.
Open Datasets | Yes | We evaluate our method on three widely-used benchmark datasets: 50 Salads [58], Breakfast [36], and GTEA [21] (see Sec. F for details).
Dataset Splits | Yes | The dataset is partitioned into 5 splits for cross-validation, and we report the average performance across all splits.
Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX-3080 GPU.
Software Dependencies | No | The paper states, 'We implement ActFusion using PyTorch [47] and some of the official code repository of DiffAct [43], licensed under an MIT License.', but it does not specify the version numbers for PyTorch or any other software libraries used.
Experiment Setup | Yes | Table S5 presents the specific hyperparameters used in our experiments for each dataset.
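The Dataset Splits row above describes the standard protocol of partitioning each dataset into 5 splits and reporting the average performance across them. The following is a minimal sketch of that reporting loop only; the `frame_accuracy` metric and the toy split data are hypothetical placeholders, not the paper's actual evaluation code or metrics.

```python
# Sketch of k-split cross-validation reporting, as described in the
# "Dataset Splits" row. The metric and data below are hypothetical
# placeholders, not the paper's evaluation code.

def frame_accuracy(pred, gt):
    """Fraction of frames whose predicted action label matches ground truth."""
    assert len(pred) == len(gt)
    correct = sum(p == g for p, g in zip(pred, gt))
    return correct / len(gt)

def cross_validate(splits):
    """splits: list of (predictions, ground_truths) pairs, one per split.

    Returns per-split scores and their average; the average is the
    number reported across all splits.
    """
    scores = [frame_accuracy(pred, gt) for pred, gt in splits]
    return scores, sum(scores) / len(scores)

# Toy example with 2 splits instead of 5:
splits = [
    (["pour", "cut", "cut"], ["pour", "cut", "mix"]),  # 2 of 3 frames correct
    (["mix", "mix"], ["mix", "mix"]),                  # 2 of 2 frames correct
]
per_split, avg = cross_validate(splits)
```

In practice each split holds out a disjoint set of videos for testing while training on the rest, and the same averaging applies to each reported metric (e.g., accuracy, edit score, F1).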