Recurrent Space-time Graph Neural Networks
Authors: Andrei Nicolicioiu, Iulia Duta, Marius Leordeanu
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate, through extensive experiments and ablation studies, that our model outperforms strong baselines and top published methods on recognizing complex activities in video. Moreover, we obtain state-of-the-art performance on the challenging Something-Something human-object interaction dataset. Section 3, "Experiments": We perform experiments on two video classification tasks, which involve complex object interactions. |
| Researcher Affiliation | Collaboration | Andrei Nicolicioiu, Iulia Duta: Bitdefender, Romania ({anicolicioiu, iduta}@bitdefender.com). Marius Leordeanu: Bitdefender, Romania; Institute of Mathematics of the Romanian Academy; University "Politehnica" of Bucharest (marius.leordeanu@imar.ro). |
| Pseudocode | Yes | Algorithm 1 Space-time processing in RSTG model. |
| Open Source Code | Yes | The code for the full model can be found in our repository: https://github.com/IuliaDuta/RSTG |
| Open Datasets | Yes | We experiment on a video dataset that we create synthetically, containing complex patterns of movements and shapes, and on the challenging Something-Something-v1 dataset, involving interactions between a human and other objects [54]. [54] Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, Joanna Materzynska, Susanne Westphal, Heuna Kim, Valentin Haenel, Ingo Fruend, Peter Yianilos, Moritz Mueller-Freitag, et al. The "something something" video database for learning and evaluating visual common sense. In ICCV, volume 1, page 3, 2017. |
| Dataset Splits | Yes | It consists of a collection of 108499 videos with 86017, 11522 and 10960 videos for train, validation and test splits respectively. |
| Hardware Specification | Yes | We show the compute times for different variants of our model and for the Non-Local model using the Resnet-50 backbone on Something-Something videos running on one Nvidia GTX 1080 Ti GPU in Figure 4. |
| Software Dependencies | No | We implement our model in Tensorflow framework [58]. The paper mentions TensorFlow but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | We use cross-entropy as loss function and trained the model end-to-end with SGD with Nesterov Momentum with value 0.9 for momentum, starting from a learning rate of 0.0001 and decreasing by a factor of 10 when performance saturates. |
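
The optimizer settings quoted in the Experiment Setup row can be illustrated with a small framework-agnostic sketch. It is plain Python rather than the authors' TensorFlow code, and it covers only the Nesterov update and the "divide by 10 when performance saturates" rule, not the cross-entropy loss or the model. The names `PlateauSchedule` and `nesterov_step` are hypothetical, and the `patience` value is an assumption, since the source does not say how saturation is detected.

```python
from dataclasses import dataclass


@dataclass
class PlateauSchedule:
    """Divide the learning rate by `factor` when a validation metric stalls."""
    lr: float = 1e-4             # starting learning rate quoted in the table
    factor: float = 10.0         # "decreasing by a factor of 10"
    patience: int = 3            # ASSUMPTION: not stated in the source
    best: float = float("-inf")  # best validation accuracy seen so far
    stale_epochs: int = 0        # epochs since the last improvement

    def step(self, val_accuracy: float) -> float:
        """Record one epoch's validation accuracy and return the current lr."""
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr /= self.factor
                self.stale_epochs = 0
        return self.lr


def nesterov_step(w, v, grad_fn, lr, momentum=0.9):
    """One SGD update with Nesterov momentum (look-ahead gradient form).

    w and v are lists of floats (parameters and velocities); grad_fn maps
    a parameter list to a gradient list of the same length.
    """
    # Evaluate the gradient at the look-ahead point w + momentum * v.
    lookahead = [wi + momentum * vi for wi, vi in zip(w, v)]
    grad = grad_fn(lookahead)
    # Update the velocity, then move the parameters along it.
    v = [momentum * vi - lr * gi for vi, gi in zip(v, grad)]
    w = [wi + vi for wi, vi in zip(w, v)]
    return w, v
```

In a training loop, `schedule.step(val_accuracy)` would be called once per validation pass and its return value passed as `lr` to the update. Reproducing the reported numbers would still require the authors' repository, since the saturation criterion is not specified in the quoted text.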