Action Prediction From Videos via Memorizing Hard-to-Predict Samples

Authors: Yu Kong, Shangqian Gao, Bin Sun, Yun Fu

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on UCF-101 and Sports-1M datasets show that our method outperforms state-of-the-art methods."
Researcher Affiliation | Academia | Department of Electrical & Computer Engineering, College of Engineering, and College of Computer & Information Science, Northeastern University, Boston, MA, USA.
Pseudocode | No | The paper describes the proposed method and its components (mem-LSTM, two-stream network, memory module, bi-directional LSTM) in narrative text and architectural diagrams, but provides no structured pseudocode or algorithm blocks.
Open Source Code | No | The paper neither states that source code for the described methodology is released nor links to a code repository.
Open Datasets | Yes | "We use the UCF-101 (Soomro, Zamir, and Shah 2012) and Sports-1M datasets (Karpathy et al. 2014) to evaluate our method."
Dataset Splits | No | The paper uses UCF-101 and Sports-1M but does not explicitly specify training, validation, and test splits (e.g., percentages, sample counts, or a reference to predefined splits beyond "test on part of").
Hardware Specification | Yes | "The entire networks are trained using stochastic gradient descent (SGD) algorithm implemented on a single Titan X GPU."
Software Dependencies | No | The paper mentions models such as ResNet-18 and VGG-19 and the SGD algorithm, but gives no version numbers for any software dependency or library used for implementation (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | "The input and output size of LSTM is 512. The bi-directional LSTM model contains one forward LSTM layer and one backward LSTM layer... The input size of LSTM is 4096 and output size of the LSTM are 512. During training and testing, the memory-size is set to 5000, K is set to 16 and the key-size is 1024 (same as the sum of RGB LSTM and Flow LSTM). The entire networks are trained using stochastic gradient descent (SGD) algorithm."
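The quoted setup implies a key-value memory queried by nearest-key lookup (memory-size 5000, K = 16, key-size 1024). As a reading aid only, here is a minimal pure-Python sketch of such a memory module; it is not the authors' code, and the FIFO overwrite policy, cosine-similarity scoring, and all class/function names (`Memory`, `write`, `read`, `cosine`) are illustrative assumptions. Tiny sizes are used in the usage example for clarity.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors (0.0 for a zero vector).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class Memory:
    """Sketch of a key-value memory with top-K nearest-key retrieval."""

    def __init__(self, memory_size, key_size, k):
        self.memory_size = memory_size  # max stored slots (paper reports 5000)
        self.key_size = key_size        # key dimensionality (paper reports 1024)
        self.k = k                      # neighbours returned per query (paper reports 16)
        self.keys, self.values = [], []

    def write(self, key, value):
        # Evict the oldest slot once full (FIFO; an assumption, the paper's
        # actual update rule may differ).
        if len(self.keys) >= self.memory_size:
            self.keys.pop(0)
            self.values.pop(0)
        self.keys.append(key)
        self.values.append(value)

    def read(self, query):
        # Return the K stored (key, value) pairs most similar to the query key.
        scored = sorted(
            zip(self.keys, self.values),
            key=lambda kv: cosine(query, kv[0]),
            reverse=True,
        )
        return scored[: self.k]

# Usage with toy sizes instead of (5000, 1024, 16):
mem = Memory(memory_size=4, key_size=3, k=2)
mem.write([1.0, 0.0, 0.0], "a")
mem.write([0.0, 1.0, 0.0], "b")
mem.write([0.9, 0.1, 0.0], "c")
top = mem.read([1.0, 0.0, 0.0])  # two nearest entries to the query
```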