Diverse Video Generation using a Gaussian Process Trigger
Authors: Gaurav Shrivastava, Abhinav Shrivastava
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We achieve state-of-the-art results on diverse future frame generation in terms of reconstruction quality and diversity of the generated sequences. Webpage http://www.cs.umd.edu/~gauravsh/dvg.html |
| Researcher Affiliation | Academia | Gaurav Shrivastava and Abhinav Shrivastava University of Maryland, College Park {gauravsh,abhinav}@cs.umd.edu |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a webpage URL (http://www.cs.umd.edu/~gauravsh/dvg.html) in the abstract, but does not explicitly state that this page contains the source code for the methodology or provide a direct link to a code repository. |
| Open Datasets | Yes | KTH Action Recognition Dataset. The KTH action dataset (Schuldt et al., 2004) [...]; BAIR pushing Dataset. The BAIR robot pushing dataset (Ebert et al., 2017) [...]; Human3.6M Dataset. Human3.6M (Ionescu et al., 2014) [...]; UCF Dataset. This dataset (Soomro et al., 2012) [...] |
| Dataset Splits | No | The paper states, "All models use 5 frames as context (past) during training and learn to predict the next 10 frames." and describes evaluation procedures such as using "500 starting sequences" and "50 future sequences", but it does not specify explicit training, validation, or test splits (e.g., percentages or exact counts per split) for any of the datasets. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions using "GPytorch" and "Adam optimizer" and "I3D action recognition classifier" but does not specify their version numbers, which is required for reproducible software dependencies. |
| Experiment Setup | Yes | All our models are trained using Adam optimizer. All models use 5 frames as context (past) during training and learn to predict the next 10 frames. [...] For the deterministic switch, we do not use the variance of the GP as a trigger, and switch every 15 frames. [...] For the GP trigger switch, we compare the current GP variance with the mean of the variance of the last 10 states. If the current variance is larger than two standard deviations, we trigger a switch. [...] We trained all models on 64 × 64-size frames from the KTH, Human3.6M, and BAIR datasets. [...] For variational GP implementation, 40 inducing points were randomly initialized and learned during the training of GP. |
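The GP-trigger rule quoted in the Experiment Setup row can be sketched in a few lines. This is a minimal illustration based only on the paper's description, not the authors' code; the function name `should_switch` and the exact form of the threshold (mean plus two standard deviations of the recent variances) are assumptions:

```python
import numpy as np

def should_switch(current_var, past_vars):
    """Hypothetical sketch of the paper's GP-trigger rule: compare the
    current GP predictive variance against the mean of the variances of
    the last 10 states, and trigger a switch if it exceeds that mean by
    more than two standard deviations."""
    recent = np.asarray(past_vars[-10:], dtype=float)
    return bool(current_var > recent.mean() + 2.0 * recent.std())
```

For the deterministic baseline described in the same row, this check would simply be replaced by a fixed-interval rule (switch every 15 frames).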