Cubic LSTMs for Video Prediction

Authors: Hehe Fan, Linchao Zhu, Yi Yang (pp. 8263-8270)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that the Cubic LSTM-based network produces more accurate video predictions than prior methods on both synthetic and real-world datasets. We evaluated the proposed Cubic LSTM unit on three video prediction datasets, the Moving-MNIST dataset (Srivastava, Mansimov, and Salakhutdinov 2015), the Robotic Pushing dataset (Finn, Goodfellow, and Levine 2016) and the KTH Action dataset (Schüldt, Laptev, and Caputo 2004), including synthetic and real-world video sequences.
Researcher Affiliation | Academia | Hehe Fan, Linchao Zhu, Yi Yang, Centre for Artificial Intelligence, University of Technology Sydney, Australia; {hehe.fan,linchao.zhu}@student.uts.edu.au, yi.yang@uts.edu.au
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access information (e.g., a repository link or an explicit release statement) for the source code of the described methodology.
Open Datasets | Yes | We evaluated the proposed Cubic LSTM unit on three video prediction datasets, the Moving-MNIST dataset (Srivastava, Mansimov, and Salakhutdinov 2015), the Robotic Pushing dataset (Finn, Goodfellow, and Levine 2016) and the KTH Action dataset (Schüldt, Laptev, and Caputo 2004), including synthetic and real-world video sequences.
Dataset Splits | No | The paper mentions generating Moving-MNIST training samples on the fly with specific test settings, and refers to a training set and two test sets for Robotic Pushing, but it does not give percentages or sample counts for training, validation, and test splits of the datasets used.
Hardware Specification | No | The paper states 'We trained the models using eight GPUs in parallel and set the batch size to four for each GPU,' but does not specify the GPU model or any other hardware components.
Software Dependencies | No | The paper mentions an implementation in TensorFlow but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | Each state in the implementation of Cubic LSTM has 32 channels. The size of the spatial-convolutional kernel was set to 5×5. Both the temporal-convolutional kernel and the output-convolutional kernel were set to 1×1. ... All models were trained for 300K iterations with a learning rate of 10^-3 for the first 150K iterations and a learning rate of 10^-4 for the latter 150K iterations. ... all models were given three context frames before predicting 10 future frames and were trained for 100K iterations with the mean square error loss and a learning rate of 10^-3.
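For readers who want the reported setup at a glance, the quoted hyperparameters can be collected into a short configuration sketch. This is a minimal illustration assuming TensorFlow 2.x; the dictionary names (CUBIC_LSTM_CONFIG, TRAIN_CONFIG) and the choice of Adam as optimizer are our assumptions and are not stated in the paper.

```python
# Hedged sketch of the reported experiment setup; hyperparameters are quoted
# from the paper, all identifiers and the optimizer choice are illustrative.
import tensorflow as tf

CUBIC_LSTM_CONFIG = {
    "state_channels": 32,       # each state has 32 channels
    "spatial_kernel": (5, 5),   # spatial-convolutional kernel
    "temporal_kernel": (1, 1),  # temporal-convolutional kernel
    "output_kernel": (1, 1),    # output-convolutional kernel
}

TRAIN_CONFIG = {
    "iterations": 300_000,      # 300K iterations in total
    "num_gpus": 8,              # eight GPUs trained in parallel
    "batch_size_per_gpu": 4,    # batch size of four per GPU
}

# Learning rate: 1e-3 for the first 150K iterations, then 1e-4 for the
# latter 150K iterations, as reported for the 300K-iteration schedule.
lr_schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[150_000],
    values=[1e-3, 1e-4],
)

# The paper does not name the optimizer; Adam is assumed here for illustration.
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)

# For the KTH setting, the paper instead reports 100K iterations with the mean
# square error loss and a constant learning rate of 1e-3, given three context
# frames and predicting 10 future frames.
```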