Generating Videos with Scene Dynamics

Authors: Carl Vondrick, Hamed Pirsiavash, Antonio Torralba

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'Experiments suggest this model can generate tiny videos up to a second at full frame rate better than simple baselines, and we show its utility at predicting plausible futures of static images. Moreover, experiments and visualizations show the model internally learns useful features for recognizing actions with minimal supervision...'
Researcher Affiliation | Academia | Carl Vondrick (MIT, vondrick@mit.edu); Hamed Pirsiavash (UMBC, hpirsiav@umbc.edu); Antonio Torralba (MIT, torralba@mit.edu)
Pseudocode | No | The paper describes network architectures and learning procedures but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states 'Our implementation is based off a modified version of [31] in Torch7.' but does not provide a link to, or a statement about releasing, the source code for its own method.
Open Datasets | Yes | 'We downloaded over two million videos from Flickr [39] by querying for popular Flickr tags as well as querying for common English words.' ... 'Action Classification: We evaluated performance on classifying actions on UCF101 [35].'
Dataset Splits | No | The paper mentions 'train/test splits' for UCF101 in the caption of Figure 4a but does not specify percentages or sample counts for training, validation, or test sets. It also mentions 'batch normalization', but that is a training technique, not a data split.
Hardware Specification | No | The paper notes 'Training typically took several days on a GPU' and acknowledges that 'NVidia donated GPUs used for this research', but it does not specify the GPU model or any other hardware details.
Software Dependencies | No | The paper states 'Our implementation is based off a modified version of [31] in Torch7.' but does not give a version number for Torch7 or list any other software dependencies.
Experiment Setup | Yes | 'We use the Adam [16] optimizer and a fixed learning rate of 0.0002 and momentum term of 0.5. The latent code has 100 dimensions, which we sample from a normal distribution. We use a batch size of 64. We initialize all weights with zero mean Gaussian noise with standard deviation 0.01. We normalize all videos to be in the range [-1, 1].'
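
To make the quoted training configuration concrete, the sketch below expresses it in PyTorch. This is an illustration, not the authors' code: the paper used Torch7, the `generator` and `discriminator` modules here are stand-ins for the paper's two-stream video generator and spatio-temporal discriminator, and mapping the quoted 'momentum term of 0.5' to Adam's beta1 is an assumption consistent with common GAN practice.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the "Experiment Setup" row above.
LATENT_DIM = 100   # latent code dimensionality
BATCH_SIZE = 64
LR = 0.0002        # fixed learning rate
BETA1 = 0.5        # quoted "momentum term of 0.5", read here as Adam's beta1

def init_weights(module: nn.Module) -> None:
    """Zero-mean Gaussian init with standard deviation 0.01, per the quote."""
    if isinstance(module, (nn.Conv3d, nn.ConvTranspose3d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=0.01)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Placeholder networks; the paper's actual architectures are not reproduced.
generator = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.ReLU())
discriminator = nn.Sequential(nn.Linear(256, 1))
generator.apply(init_weights)
discriminator.apply(init_weights)

opt_g = torch.optim.Adam(generator.parameters(), lr=LR, betas=(BETA1, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=LR, betas=(BETA1, 0.999))

# Latent codes are sampled from a standard normal distribution.
z = torch.randn(BATCH_SIZE, LATENT_DIM)

def normalize_videos(videos: torch.Tensor) -> torch.Tensor:
    """Map uint8 pixel values in [0, 255] into [-1, 1], per the quoted setup."""
    return videos.float() / 127.5 - 1.0
```

These values match the quoted text exactly (learning rate 0.0002, beta1 0.5, 100-dimensional normal latent code, batch size 64, Gaussian init with std 0.01, inputs normalized to [-1, 1]); everything else in the sketch is scaffolding.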