Stochastic Video Generation with a Learned Prior

Authors: Emily Denton, Rob Fergus

ICML 2018

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate our SVG-FP and SVG-LP model on one synthetic video dataset (Stochastic Moving MNIST) and two real ones (KTH actions (Schuldt et al., 2004) and BAIR robot (Ebert et al., 2017)). We show quantitative comparisons by computing structural similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR) scores between ground truth and generated video sequences.
Researcher Affiliation Collaboration Emily Denton¹, Rob Fergus¹,²; ¹New York University, ²Facebook AI Research.
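Pseudocode Yes For a time step t during training, the generation is as follows, where the LSTM recurrence is omitted for brevity: $h_t = \mathrm{Enc}(x_t)$, $\mu_\phi(t), \sigma_\phi(t) = \mathrm{LSTM}_\phi(h_t)$, $z_t \sim \mathcal{N}(\mu_\phi(t), \sigma_\phi(t))$, $h_{t-1} = \mathrm{Enc}(x_{t-1})$, $g_t = \mathrm{LSTM}_\theta(h_{t-1}, z_t)$, $\mu_\theta(t) = \mathrm{Dec}(g_t)$.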
Open Source Code Yes Source code and trained models are available at https://github.com/edenton/svg.
Open Datasets Yes We evaluate our SVG-FP and SVG-LP model on one synthetic video dataset (Stochastic Moving MNIST) and two real ones (KTH actions (Schuldt et al., 2004) and BAIR robot (Ebert et al., 2017)).
Dataset Splits No The paper mentions training on datasets and evaluating on 'unseen test videos' and 'held out test sequences', but does not specify a separate validation dataset split or its size/proportion for reproduction.
Hardware Specification No The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running experiments.
Software Dependencies No The paper mentions the use of the ADAM optimizer and various network architectures (DCGAN, VGG16), but does not provide specific software version numbers for libraries or frameworks used (e.g., TensorFlow, PyTorch versions).
Experiment Setup Yes We train all the models with the ADAM optimizer (Kingma & Ba, 2014) and learning rate η = 0.002. We set β = 1e-4 for SM-MNIST and BAIR and β = 1e-6 for KTH.
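
The sampling step z_t ~ N(µφ(t), σφ(t)) from the Pseudocode row and the β values from the Experiment Setup row can be tied together with a small sketch. It assumes that β weights a KL term between the posterior and the prior (a fixed N(0, I) for SVG-FP, a learned Gaussian for SVG-LP), that σ denotes a per-dimension standard deviation, and that reconstruction is scored with mean squared error. The function names `sample_latent`, `kl_diag_gaussians`, and `step_loss` are hypothetical and do not come from the released code.

```python
import math
import random


def sample_latent(mu, sigma, rng):
    """Reparameterized draw z = mu + sigma * eps with eps ~ N(0, 1) per dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)), summed over dimensions.

    Assumption: sigma values are standard deviations, not variances.
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return total


def step_loss(x_true, x_pred, mu_q, sigma_q, mu_p, sigma_p, beta):
    """Per-time-step loss: mean squared reconstruction error plus beta * KL.

    Assumption: beta plays the role of the KL weight from the setup row
    (e.g. 1e-4 or 1e-6); frames are flattened to plain lists of floats.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x_true, x_pred)) / len(x_true)
    return recon + beta * kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p)


if __name__ == "__main__":
    rng = random.Random(0)
    mu_q, sigma_q = [0.3, -0.1], [0.9, 1.1]   # posterior parameters (illustrative)
    mu_p, sigma_p = [0.0, 0.0], [1.0, 1.0]    # fixed N(0, I) prior, as in SVG-FP
    z_t = sample_latent(mu_q, sigma_q, rng)
    loss = step_loss([0.5, 0.2], [0.4, 0.25], mu_q, sigma_q, mu_p, sigma_p, beta=1e-4)
    print(z_t, loss)
```

With a β this small, the KL term contributes little to the loss compared with the reconstruction error, so the posterior is only lightly constrained to stay near the prior.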