Temporal Gaussian Mixture Layer for Videos
Authors: AJ Piergiovanni, Michael S. Ryoo
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The extensive experiments on multiple datasets, including Charades and MultiTHUMOS, confirm the effectiveness of TGM layers, significantly outperforming the state-of-the-arts. |
| Researcher Affiliation | Academia | Department of Computer Science, Indiana University. Correspondence to: AJ Piergiovanni <ajpiergi@indiana.edu>, Michael Ryoo <mryoo@indiana.edu>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., a figure or section labeled "Algorithm" or "Pseudocode"). |
| Open Source Code | Yes | Code/models: https://github.com/piergiaj/tgm-icml19 |
| Open Datasets | Yes | We conducted our experiments on both THUMOS (Jiang et al., 2014) and MultiTHUMOS (Yeung et al., 2015) datasets... Charades (Sigurdsson et al., 2016b) is a large scale dataset... |
| Dataset Splits | Yes | There are 1010 validation videos and 1574 test videos. We used these continuous validation videos for the training of our models. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. It mentions using I3D and Inception V3 as base CNNs but provides no information about the computational resources used. |
| Software Dependencies | No | The paper mentions using I3D (Carreira & Zisserman, 2017) and Inception V3 (Szegedy et al., 2016) and their pretraining datasets (ImageNet, Kinetics), but does not provide specific version numbers for these software components or any other libraries/frameworks like PyTorch or TensorFlow. |
| Experiment Setup | Yes | Our default L setting used for the TGM layers as well as the other baselines was as follows: when using I3D segment features (3 features per second from 24fps videos), the 1 layer models used L = 15 and the 3 layer models used L = 5. When using Inception V3 frame features (at 8 fps), the 1 layer models used L = 30 and the 3 layer models used L = 10. (See the sketch after this table.) |
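
The L values above set the temporal extent of the Gaussian mixture kernels learned by each TGM layer. The snippet below is a minimal PyTorch sketch of such a layer, assuming only the paper's description: a set of Gaussians with learnable centers and widths is sampled over a window of length L and mixed by soft-attention weights into temporal convolution kernels applied to per-frame or per-segment features. Class and parameter names (`TGMLayer`, `n_gaussians`, `length`) are illustrative and are not taken from the authors' released code at https://github.com/piergiaj/tgm-icml19.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TGMLayer(nn.Module):
    """Sketch of a temporal Gaussian mixture (TGM) layer.

    n_gaussians Gaussians with learnable centers/widths are sampled over a
    temporal window of length L, then combined by soft-attention mixing
    weights into (c_out x c_in) temporal kernels used as a 1-D convolution
    over the feature sequence. Hyperparameter names are illustrative.
    """

    def __init__(self, c_in, c_out, n_gaussians=16, length=15):
        super().__init__()
        self.L = length
        self.c_in, self.c_out = c_in, c_out
        # unconstrained Gaussian parameters; squashed to valid ranges below
        self.center = nn.Parameter(torch.randn(n_gaussians))
        self.width = nn.Parameter(torch.randn(n_gaussians))
        # soft-attention mixing weights over the Gaussians
        self.mix = nn.Parameter(torch.randn(c_out * c_in, n_gaussians))

    def gaussian_kernels(self):
        # centers constrained to [0, L-1]; widths kept strictly positive
        mu = (torch.tanh(self.center) + 1.0) * 0.5 * (self.L - 1)
        sigma = torch.exp(self.width).clamp(min=1e-3)
        t = torch.arange(self.L, dtype=mu.dtype, device=mu.device)
        # (n_gaussians, L) kernels, each normalized to sum to 1 over time
        k = torch.exp(-((t.unsqueeze(0) - mu.unsqueeze(1)) ** 2)
                      / (2.0 * sigma.unsqueeze(1) ** 2))
        return k / k.sum(dim=1, keepdim=True)

    def forward(self, x):
        # x: (batch, c_in, T) sequence of per-frame/segment features
        k = self.gaussian_kernels()                      # (M, L)
        w = F.softmax(self.mix, dim=1)                   # (c_out*c_in, M)
        kernels = (w @ k).view(self.c_out, self.c_in, self.L)
        return F.conv1d(x, kernels, padding=self.L // 2)
```

A hypothetical call matching the paper's default 1-layer I3D setting would be `TGMLayer(c_in=1024, c_out=1024, n_gaussians=16, length=15)` applied to features of shape `(batch, 1024, T)`, returning a tensor of the same temporal length. The released implementation may structure the channel mixing differently, so treat this purely as an illustration of how Gaussian-mixture temporal kernels with length L can be constructed and applied.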