MOMA: Multi-Object Multi-Actor Activity Parsing

Authors: Zelun Luo, Wanze Xie, Siddharth Kapoor, Yiyun Liang, Michael Cooper, Juan Carlos Niebles, Ehsan Adeli, Fei-Fei Li

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we examine three different tasks using the MOMA dataset. ... For all three experiments, we split 80% of data for training, and hold out the rest for validation. The major hyperparameter we focus on is the loss weight assigned for each output head, while we also tune on model hidden size and initial learning rate. We use an 8-GPU (Tesla V100) environment for training video feature extractor and fine-tuning on the action hypergraph. We demonstrate the significance of action hypergraph representation through these three experiments and the carefully designed baselines." (a loss-weighting sketch follows this table)
Researcher Affiliation | Academia | "Zelun Luo, Wanze Xie, Siddharth Kapoor, Yiyun Liang, Michael Cooper, Juan Carlos Niebles, Ehsan Adeli, Li Fei-Fei. Stanford University. {alanzluo, wanzexie, siddkap, isaliang, coopermj, jniebles, eadeli, feifeili}@stanford.edu"
Pseudocode | No | The paper includes a structural diagram of the HGAP model (Figure 4) and describes its components, but it does not provide pseudocode or an algorithm block.
Open Source Code | No | "Code, data, and further instructions will be released at https://moma.stanford.edu/."
Open Datasets | Yes | "Lastly, we introduce the MOMA (Multi-Object, Multi-Actor) dataset..." The proposed MOMA dataset will be released at https://moma.stanford.edu/.
Dataset Splits | Yes | "For all three experiments, we split 80% of data for training, and hold out the rest for validation." (a split sketch follows this table)
Hardware Specification | Yes | "We use an 8-GPU (Tesla V100) environment for training video feature extractor and fine-tuning on the action hypergraph."
Software Dependencies | No | The paper does not provide specific software names with version numbers for its dependencies (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | "The major hyperparameter we focus on is the loss weight assigned for each output head, while we also tune on model hidden size and initial learning rate. ... The input to the video model includes a trimmed video sequence $v = \{i^{(1)}, i^{(2)}, \ldots, i^{(N)}\}$ of N image frames with a sampling rate of 8 and N = 16. We pretrain the X3D_L model on Kinetics400 [30]..." (a frame-sampling sketch follows this table)
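
The "loss weight assigned for each output head" in the Research Type and Experiment Setup rows describes a standard weighted multi-task objective. A minimal sketch of what that might look like; the head names and weight values here are illustrative assumptions, since the paper reports tuning the weights but not their final values:

```python
import torch.nn.functional as F

# Hypothetical per-head weights; the paper tunes these but does not report them.
LOSS_WEIGHTS = {"activity": 1.0, "sub_activity": 0.5, "atomic_action": 0.5}

def multi_head_loss(outputs: dict, targets: dict):
    """Weighted sum of per-head cross-entropy losses.

    `outputs[h]` holds the logits and `targets[h]` the class indices for
    output head `h`; the weighted sum is what would be backpropagated.
    """
    return sum(w * F.cross_entropy(outputs[h], targets[h])
               for h, w in LOSS_WEIGHTS.items())
```

Tuning the relative weights (rather than the architecture) is the usual way to balance heads whose losses sit on different scales, which is consistent with the paper calling the weights its "major hyperparameter".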
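The 80/20 train/validation split in the Dataset Splits row is stated only at a high level. A minimal sketch of how such a split might be reproduced; the random seed and per-sample granularity are assumptions, not details from the paper:

```python
import numpy as np

def train_val_split(num_samples: int, train_frac: float = 0.8, seed: int = 0):
    """Shuffle sample indices and hold out the last 20% for validation.

    The seed and the per-sample (rather than per-video) granularity are
    assumptions; the paper states only an 80/20 train/validation split.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_samples)
    cut = int(train_frac * num_samples)
    return idx[:cut], idx[cut:]

train_idx, val_idx = train_val_split(1000)  # 800 train, 200 validation
```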
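The input clip in the Experiment Setup row, N = 16 frames at a sampling rate of 8, spans a 16 x 8 = 128-frame temporal window. A sketch of the index arithmetic; centering the window inside the trimmed video is an assumption, as the paper specifies only the clip length and sampling rate:

```python
import numpy as np

def sample_clip_indices(num_frames: int, clip_len: int = 16, rate: int = 8):
    """Pick clip_len frame indices spaced `rate` frames apart.

    Centering the 128-frame window inside the trimmed video is an
    assumption; the paper states only clip_len = 16 and rate = 8.
    """
    span = clip_len * rate                    # 128-frame temporal window
    start = max(0, (num_frames - span) // 2)  # center the window
    idx = start + rate * np.arange(clip_len)  # start, start+8, start+16, ...
    return np.clip(idx, 0, num_frames - 1)    # guard against short videos

indices = sample_clip_indices(300)  # 16 indices into a 300-frame video
```

The resulting frames would then be fed to the X3D_L backbone that the paper pretrains on Kinetics400.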