Hyperbolic Self-paced Learning for Self-supervised Skeleton-based Action Representations

Authors: Luca Franco, Paolo Mandica, Bharti Munjal, Fabio Galasso

ICLR 2023

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | When tested on three established skeleton-based action recognition datasets, HYSP outperforms the state-of-the-art on PKU-MMD I, as well as on 2 out of 3 downstream tasks on NTU-60 and NTU-120. Code is available at https://github.com/paolomandica/HYSP.
Researcher Affiliation | Academia | Luca Franco (1), Paolo Mandica (1), Bharti Munjal (1,2), Fabio Galasso (1) — (1) Sapienza University of Rome, (2) Technical University of Munich
Pseudocode | No | The paper includes mathematical equations for its model, but no structured pseudocode or algorithm blocks are provided.
Open Source Code | Yes | Code is available at https://github.com/paolomandica/HYSP.
Open Datasets | Yes | NTU RGB+D 60 Dataset (Shahroudy et al., 2016). This contains 56,578 video sequences divided into 60 action classes, captured with three concurrent Kinect V2 cameras from 40 distinct subjects. The dataset follows two evaluation protocols: cross-subject (xsub), where the subjects are split evenly into train and test sets, and cross-view (xview), where the samples of one camera are used for testing and those of the other cameras for training.
Dataset Splits | Yes | The dataset follows two evaluation protocols: cross-subject (xsub), where the subjects are split evenly into train and test sets, and cross-view (xview), where the samples of one camera are used for testing and those of the other cameras for training. (...) Semi-supervised Protocol. Encoder and linear classifier are finetuned with 10% of the labeled data.
Hardware Specification | Yes | Training on 4 Nvidia Tesla A100 GPUs takes approximately 8 hours.
Software Dependencies | No | The paper mentions software components like 'ST-GCN', 'Riemannian SGD', 'BYOL', and 'SGD optimizer', but does not provide specific version numbers for these software dependencies or for the underlying framework such as PyTorch.
Experiment Setup | Yes | The encoder f is ST-GCN (Yan et al., 2018) with output dimension 1024. Following BYOL (Grill et al., 2020), the projector and predictor MLPs are linear layers with dimension 1024, followed by batch normalization, ReLU and a final linear layer with dimension 1024. The model is trained with batch size 512 and learning rate 0.2 using the Riemannian SGD (Kochurov et al., 2020) optimizer with momentum 0.9 and weight decay 0.0001. For curriculum learning, across all experiments, we set e1 = 50 and e2 = 100 in Eq. 8. (...) In downstream evaluation, the model is trained for 100 epochs using the SGD optimizer with momentum 0.9 and weight decay 0.
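The projector/predictor heads and optimizer hyperparameters quoted above can be sketched in PyTorch. This is a minimal illustration, not the HYSP implementation: the helper name `make_mlp_head` is ours, and plain `torch.optim.SGD` stands in for the Riemannian SGD of geoopt (Kochurov et al., 2020) that the paper actually uses, so the sketch stays dependency-light.

```python
import torch
import torch.nn as nn

def make_mlp_head(dim: int = 1024) -> nn.Sequential:
    """BYOL-style head as described in the paper: a linear layer (dim 1024),
    batch norm, ReLU, then a final linear layer (dim 1024).
    (Name and structure here are an illustrative sketch.)"""
    return nn.Sequential(
        nn.Linear(dim, dim),
        nn.BatchNorm1d(dim),
        nn.ReLU(inplace=True),
        nn.Linear(dim, dim),
    )

projector = make_mlp_head()
predictor = make_mlp_head()

# Hyperparameters quoted from the paper: lr 0.2, momentum 0.9, weight decay 1e-4
# (batch size 512 in the real runs). The paper pairs these with Riemannian SGD;
# torch.optim.SGD is a stand-in for this sketch.
optimizer = torch.optim.SGD(
    list(projector.parameters()) + list(predictor.parameters()),
    lr=0.2, momentum=0.9, weight_decay=1e-4,
)

x = torch.randn(8, 1024)           # stand-in batch of 1024-d encoder features
out = predictor(projector(x))      # BYOL pattern: project, then predict
print(out.shape)                   # torch.Size([8, 1024])
```

In the real pipeline the 1024-d input would come from the ST-GCN encoder, and the target branch would keep an EMA copy of the projector, as in BYOL.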