ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation
Authors: Dayoung Gong, Suha Kwak, Minsu Cho
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the bi-directional benefits between action segmentation and anticipation. ActFusion achieves state-of-the-art performance on the standard benchmarks of 50 Salads, Breakfast, and GTEA, outperforming task-specific models on both tasks with a single unified model through joint learning. |
| Researcher Affiliation | Academia | Dayoung Gong, Suha Kwak, Minsu Cho; Pohang University of Science and Technology (POSTECH); {dayoung.gong, suha.kwak, mscho}@postech.ac.kr |
| Pseudocode | Yes | We provide the training algorithm of ActFusion in Alg. 1 and the inference algorithms for TAS and LTA in Alg. 2 and Alg. 3, respectively. |
| Open Source Code | Yes | We include the code and instructions for reproduction in the supplementary. The training and validation data is available online. |
| Open Datasets | Yes | We evaluate our method on three widely-used benchmark datasets: 50 Salads [58], Breakfast [36], and GTEA [21] (see Sec. F for details). |
| Dataset Splits | Yes | The dataset is partitioned into 5 splits for cross-validation, and we report the average performance across all splits (see the sketch after this table). |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX-3080 GPU. |
| Software Dependencies | No | The paper states, 'We implement ActFusion using PyTorch [47] and some of the official code repository of DiffAct [43] licensed under an MIT License.', but it does not specify the version numbers for PyTorch or any other software libraries used. |
| Experiment Setup | Yes | Table S5 presents the specific hyperparameters used in our experiments for each dataset. |
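
The "Dataset Splits" row above describes the evaluation protocol: each benchmark is partitioned into 5 splits and the reported number is the average over all of them. Below is a minimal sketch of that averaging step, not the authors' code; the helper `evaluate_on_split` and the function name `cross_validated_score` are hypothetical placeholders for the paper's actual per-split training and evaluation.

```python
from statistics import mean

def evaluate_on_split(split_id: int) -> float:
    """Placeholder: train on the remaining splits, evaluate on `split_id`,
    and return the metric of interest (e.g., frame-wise accuracy)."""
    return 0.0  # replace with the real per-split evaluation

def cross_validated_score(num_splits: int = 5) -> float:
    """Mean performance over all cross-validation splits, matching the reported protocol."""
    return mean(evaluate_on_split(s) for s in range(num_splits))

if __name__ == "__main__":
    print(f"Average over 5 splits: {cross_validated_score():.2f}")
```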