Convolutional Tensor-Train LSTM for Spatio-Temporal Learning
Authors: Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Anima Anandkumar
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments. Here, we empirically evaluate our approach on several datasets for two different tasks, video prediction and early activity recognition, and find that it outperforms existing approaches. Evaluation. For video prediction, the model predicts every pixel in the frame. We test our proposed models on the KTH human action dataset [20] with resolution 128 × 128 and the Moving-MNIST-2 dataset [2] with resolution 64 × 64. |
| Researcher Affiliation | Collaboration | ¹University of Maryland, College Park, MD; ²NVIDIA Research, Santa Clara, CA |
| Pseudocode | Yes | The full procedure can be found in Appendix A (algorithm 2). |
| Open Source Code | Yes | Both versions are available online: https://github.com/NVlabs/conv-tt-lstm. |
| Open Datasets | Yes | We test our proposed models on the KTH human action dataset [20] with resolution 128 × 128 and the Moving-MNIST-2 dataset [2] with resolution 64 × 64. For early activity recognition, we evaluate our approach on the Something-Something V2 dataset. Following [7], we used the subset of 41 categories defined by Goyal et al. [21] (Table 7). |
| Dataset Splits | No | The paper states it validates hyper-parameters on a 'validation set' but does not provide specific split percentages or counts for training, validation, and test datasets needed to reproduce the experiment. |
| Hardware Specification | No | The paper mentions optimizing for 'GPUs' and 'CPUs' and refers to the 'NVIDIA apex library' and 'CUDA multi-streams', but it does not provide specific model numbers for GPUs or CPUs, or detailed hardware specifications used for running experiments. |
| Software Dependencies | No | The paper mentions using 'NVIDIA apex library', 'ADAM optimizer', and 'Torch Script' for efficient implementation, but it does not specify any version numbers for these software components. |
| Experiment Setup | Yes | Hyper-parameter selection. We validate the hyper-parameters of our Conv-TT-LSTM through a wide grid search on the validation set. Specifically, we consider a base filter size S = 3, 5, order of the decomposition N = 1, 2, 3, 5, tensor ranks C(i) = 4, 8, 16, and number of hidden states M = 1, 3, 5. Appendix B contains the details of our hyper-parameter search. (A minimal sketch of this grid appears below the table.) |
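
To make the Experiment Setup row concrete, the sketch below enumerates the hyper-parameter grid quoted from the paper (base filter size S, decomposition order N, tensor ranks C(i), and number of hidden states M). This is a minimal illustration only: the `build_and_score` step is a hypothetical placeholder, and the paper's actual search procedure and selected values are described in its Appendix B.

```python
from itertools import product

# Hyper-parameter grid quoted from the paper's Experiment Setup row.
grid = {
    "filter_size": [3, 5],       # base filter size S
    "order": [1, 2, 3, 5],       # order of the tensor-train decomposition N
    "rank": [4, 8, 16],          # tensor ranks C(i)
    "hidden_states": [1, 3, 5],  # number of hidden states M
}

def configurations(grid):
    """Yield every combination of hyper-parameter values in the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

best_config, best_score = None, float("-inf")
for config in configurations(grid):
    # Hypothetical placeholder: train a Conv-TT-LSTM with `config` and
    # return its score on the validation set (not specified by the paper),
    # e.g. score = build_and_score(config)
    score = 0.0
    if score > best_score:
        best_config, best_score = config, score

print(best_config)
```

This grid yields 2 × 4 × 3 × 3 = 72 candidate configurations; how many were actually trained, and on which validation split, is not stated in the quoted text.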