Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Discovering Dynamic Salient Regions for Spatio-Temporal Graph Neural Networks
Authors: Iulia Duta, Andrei Nicolicioiu, Marius Leordeanu
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In extensive ablation studies and experiments on two challenging datasets, we show superior performance to previous graph neural networks models for video classification. |
| Researcher Affiliation | Collaboration | Iulia Duta (Bitdefender, Romania); Andrei Nicolicioiu* (Bitdefender, Romania); Marius Leordeanu (Bitdefender, Romania; Institute of Mathematics of the Romanian Academy; University "Politehnica" of Bucharest) |
| Pseudocode | No | The paper describes the model using equations and text but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for our method can be found in our repository (footnote 2: https://github.com/bit-ml/DyReg-GNN). |
| Open Datasets | Yes | We test our model on two video classification datasets that seem to offer the best advantages, being large enough and requiring abilities to model complex interactions. We evaluate on real-world datasets, Something-Something-V1&V2 [74], while we also test on a variant of the SyncMNIST [3] dataset. |
| Dataset Splits | Yes | "Something-Something-V1&V2 [74] datasets classify scenes involving human-object complex interactions. They consist of 86K / 169K training videos and 11K / 25K validation videos, having 174 classes." and "The dataset contains 600k training videos and 10k validation videos with 10 frames each." |
| Hardware Specification | No | The paper states training was done "on two GPUs" or "on a single GPU" but does not specify the exact GPU models, CPU, memory, or any other specific hardware components used for the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any ancillary software dependencies (e.g., programming languages, libraries, or frameworks like Python, PyTorch, or TensorFlow). |
| Experiment Setup | Yes | In all experiments we follow the training setting of [67], using 16 frames resized to have the shorter side of size 256, and randomly sample a crop of size 224 × 224. For the evaluations, we follow the setting in [67] of taking 3 spatial crops of size 256 × 256 with 2 temporal samplings and averaging their results. For training, we use SGD optimizer with learning rate 0.001 and momentum 0.9, using a total batch-size of 10, trained on two GPUs. We decrease the learning rate by a factor of 10 three times when the optimisation reaches a plateau. |
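
The learning-rate rule quoted in the Experiment Setup row can be sketched in plain Python, to make the schedule concrete for anyone trying to reproduce it. This is only an illustration, not the authors' implementation. The class name `PlateauSchedule`, the use of validation loss as the plateau signal, and the `patience` value are assumptions; the excerpt states only the initial rate (0.001), the factor (10), and the number of drops (three).

```python
from dataclasses import dataclass


@dataclass
class PlateauSchedule:
    """Hypothetical helper: divide the learning rate when the loss stalls."""

    lr: float = 0.001       # initial learning rate quoted in the table
    factor: float = 10.0    # "decrease the learning rate by a factor of 10"
    max_drops: int = 3      # "three times"
    patience: int = 2       # ASSUMPTION: the excerpt does not define "plateau"
    best: float = float("inf")
    stale_epochs: int = 0
    drops: int = 0

    def step(self, val_loss: float) -> float:
        """Record one epoch's validation loss and return the current rate."""
        if val_loss < self.best:
            # Improvement: remember it and reset the stall counter.
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1

        # Drop the rate once the loss has stalled for more than `patience`
        # epochs, but never more than `max_drops` times in total.
        if self.stale_epochs > self.patience and self.drops < self.max_drops:
            self.lr /= self.factor
            self.drops += 1
            self.stale_epochs = 0
        return self.lr
```

Calling `step` once per epoch with that epoch's validation loss returns the rate to use next. Under these assumptions the rate moves from 0.001 to roughly 1e-6 after three stalls and then stays fixed.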