Iterative Contrast-Classify for Semi-supervised Temporal Action Segmentation
Authors: Dipika Singhania, Rahul Rahaman, Angela Yao (pp. 2262-2270)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Temporal action segmentation classifies the action of each frame in (long) video sequences. Due to the high cost of framewise labeling, we propose the first semi-supervised method for temporal action segmentation. Our method hinges on unsupervised representation learning, which, for temporal action segmentation, poses unique challenges... We develop an Iterative Contrast-Classify (ICC) semi-supervised learning scheme. With more labelled data, ICC progressively improves in performance; ICC semi-supervised learning, with 40% labelled videos, performs similarly to fully-supervised counterparts. Our ICC improves MoF by {+1.8, +5.6, +2.5}% on Breakfast, 50Salads and GTEA respectively for 100% labelled videos. |
| Researcher Affiliation | Academia | Dipika Singhania , Rahul Rahaman , Angela Yao National University of Singapore dipika16@comp.nus.edu.sg, rahul.rahaman@u.nus.edu, ayao@comp.nus.edu.sg |
| Pseudocode | No | The paper does not contain any explicit pseudocode blocks or sections labeled 'Algorithm'. |
| Open Source Code | No | The provided text does not contain any statement about open-sourcing the code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We test on Breakfast Actions (Kuehne, Arslan, and Serre 2014) (1.7k videos, 10 complex activities, 48 actions), 50Salads (Stein and McKenna 2013) (50 videos, 19 actions) and GTEA (Fathi, Ren, and Rehg 2011) (28 videos, 11 actions). |
| Dataset Splits | No | The paper mentions 'specified train-test splits' and refers to 'labelled dataset DL' and 'unlabelled videos DU' within the training process. However, it does not explicitly specify a distinct validation set with percentages or sample counts, nor does it describe a methodology for creating one for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific libraries with their versions). |
| Experiment Setup | Yes | We sample frames from each video with K = {20, 60, 20} partitions, ε = 1/(3K) for sampling, and temporal proximity δ = {0.03, 0.5, 0.5} for Breakfast, 50Salads, and GTEA respectively. The contrastive temperature τ in Eqs. (3) and (4) is set to 0.1. |
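
The experiment-setup row mentions a contrastive temperature τ = 0.1 used in the paper's contrastive objectives (Eqs. (3) and (4), not reproduced here). As a minimal sketch of how such a temperature enters a standard InfoNCE-style contrastive loss — the exact loss form, similarity function, and positive/negative construction in the paper are not shown in this report, so all names below are illustrative assumptions:

```python
import math

def info_nce_loss(sim_pos, sim_negs, tau=0.1):
    """InfoNCE-style contrastive loss for a single anchor.

    sim_pos:  similarity score to the positive sample (e.g. cosine similarity).
    sim_negs: similarity scores to the negative samples.
    tau:      contrastive temperature; the report states tau = 0.1.
    """
    numerator = math.exp(sim_pos / tau)
    denominator = numerator + sum(math.exp(s / tau) for s in sim_negs)
    return -math.log(numerator / denominator)

# A lower temperature sharpens the softmax, so small similarity gaps
# between the positive and the negatives are penalized more strongly.
easy_case = info_nce_loss(sim_pos=0.9, sim_negs=[0.1, -0.2, 0.05])
hard_case = info_nce_loss(sim_pos=0.2, sim_negs=[0.15, 0.1, 0.05])
```

In this sketch the loss is small when the anchor is far more similar to its positive than to any negative, and grows as the negatives catch up.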