Decomposing Motion and Content for Natural Video Sequence Prediction

Authors: Ruben Villegas, Jimei Yang, Seunghoon Hong, Xunyu Lin, Honglak Lee

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed network architecture on human activity videos using KTH, Weizmann action, and UCF-101 datasets. We show state-of-the-art performance in comparison to recent approaches. To the best of our knowledge, this is the first end-to-end trainable network architecture with motion and content separation to model the spatio-temporal dynamics for pixel-level future prediction in natural videos.
Researcher Affiliation | Collaboration | Ruben Villegas1, Jimei Yang2, Seunghoon Hong3, Xunyu Lin4,*, Honglak Lee1,5 (1University of Michigan, Ann Arbor, USA; 2Adobe Research, San Jose, CA 95110; 3POSTECH, Pohang, Korea; 4Beihang University, Beijing, China; 5Google Brain, Mountain View, CA 94043)
Pseudocode | No | The paper describes the model architecture and algorithm steps in detail with text and diagrams, but does not provide formal pseudocode blocks or algorithms.
Open Source Code | No | The paper mentions a project website (https://sites.google.com/a/umich.edu/rubenevillegas/iclr2017) for qualitative comparisons and videos, but does not explicitly state that source code for the methodology is available there or elsewhere.
Open Datasets | Yes | We evaluate the proposed network architecture on human activity videos using KTH (Schuldt et al., 2004), Weizmann action (Gorelick et al., 2007), and UCF-101 (Soomro et al., 2012) datasets. ... all networks were trained on Sports-1M (Karpathy et al., 2014) dataset and tested on UCF-101 unless otherwise stated.
Dataset Splits | No | The paper specifies training and testing splits (e.g., 'person 1-16 for training and 17-25 for testing' for KTH, or 'trained on Sports-1M... and tested on UCF-101'), but does not explicitly mention a distinct validation set or how it was used.
Hardware Specification | Yes | We also thank NVIDIA for donating K40c and TITAN X GPUs.
Software Dependencies | No | The paper mentions using CNNs and LSTMs but does not specify any software libraries, frameworks, or their version numbers (e.g., TensorFlow, PyTorch, Python version).
Experiment Setup | Yes | For all our experiments, we use α = 1, λ = 1, and p = 2 in the loss functions. ... We set β = 0.02 for training. ... We set β = 0.001 for training.
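The hyperparameters quoted in the Experiment Setup row can be read as weights in a composite training objective: a pixel-wise Lp reconstruction term, a gradient-difference term, and an adversarial term. The sketch below is a rough illustration only, not the authors' released code; the function names, array shapes, and the exact role of λ as the gradient-difference weight are assumptions.

```python
import numpy as np

def image_loss(pred, target, p=2, lam=1.0):
    """Reconstruction term: Lp pixel loss plus a gradient-difference
    penalty on spatial gradients. Treating lam as the weight on the
    gradient-difference term is an assumption for illustration."""
    lp = np.sum(np.abs(pred - target) ** p)
    gdl = (np.sum(np.abs(np.abs(np.diff(pred, axis=0)) -
                         np.abs(np.diff(target, axis=0)))) +
           np.sum(np.abs(np.abs(np.diff(pred, axis=1)) -
                         np.abs(np.diff(target, axis=1)))))
    return lp + lam * gdl

def total_loss(pred, target, adversarial_term, alpha=1.0, beta=0.02,
               p=2, lam=1.0):
    """Composite objective: alpha weights the image loss and beta the
    adversarial term (the paper reports beta = 0.02 or 0.001 depending
    on the training setup)."""
    return alpha * image_loss(pred, target, p=p, lam=lam) + beta * adversarial_term
```

With α = 1, λ = 1, p = 2 as quoted, the reconstruction term reduces to a squared-error loss plus an equally weighted gradient-difference penalty, with the adversarial term scaled down by β.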