Generalizable Features From Unsupervised Learning

Authors: Mehdi Mirza, Aaron Courville, Yoshua Bengio

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that an unsupervised model, trained to predict future frames of a video sequence of stable and unstable block configurations, can yield features that support extrapolating stability prediction to block configurations outside the training set distribution. Figure 4 shows the classification results for each of the 9 models described in Section 4.2 tested on 3, 4 and 5 blocks.
Researcher Affiliation | Academia | Mehdi Mirza & Aaron Courville & Yoshua Bengio, MILA, Université de Montréal, {memirzamo, aaron.courville, yoshua.umontreal}@gmail.com
Pseudocode | No | No structured pseudocode or algorithm blocks are present in the paper. Tables 1 and 2 describe model architectures but are not pseudocode.
Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the methodology described. Footnote 1 links to an external physics engine, not the authors' implementation.
Open Datasets | No | We, therefore, construct a new dataset, with a similar setup as Lerer et al. (2016); Zhang et al. (2016), that includes this video sequence. We use a Javascript based physics engine (footnote 1) to generate the data. The paper describes generating its own dataset but does not provide access information or state its public availability.
Dataset Splits | Yes | For each tower height, we create 8000, 1000 and 3000 video clips for the training, validation, and test set, respectively. (See the data-pipeline sketch after the table.)
Hardware Specification | No | The paper does not provide any specific hardware details used for running experiments.
Software Dependencies | No | The paper mentions specific algorithms and architectures like the 'Adam optimizer' and 'ResNet', and a 'Javascript based physics engine', but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | All activation functions are ReLU (Nair & Hinton, 2010). The objective function is the mean squared error between the generated last frame and the ground-truth frame; as a result, this training will not require any labels. Mean squared error is minimized using the Adam optimizer (Kingma & Ba, 2014) and we use early stopping when the validation loss does not improve for 100 epochs. For training, we further subsample in the time dimension and reduce the sequence length to 5 time steps. All images are contrast normalized independently and we augment our training set using random horizontal flips of the images and by randomly changing the contrast and brightness. (See the training-loop sketch after the table.)
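
The Dataset Splits and Experiment Setup rows together pin down a simple data pipeline: 8000/1000/3000 clips per tower height, temporal subsampling to 5 frames, per-frame contrast normalization, and random horizontal flip plus brightness/contrast changes during training. Below is a minimal NumPy sketch of that pipeline; the function name, the uniform frame-sampling scheme, and the jitter ranges are illustrative assumptions, not details taken from the paper or its (unreleased) code.

```python
# Hedged sketch of the data pipeline described in the "Dataset Splits" and
# "Experiment Setup" rows. Split sizes and preprocessing steps come from the
# paper's text; everything else here is an illustrative assumption.
import numpy as np

# Per tower height (3, 4, 5 blocks): 8000 train / 1000 validation / 3000 test clips.
SPLITS = {"train": 8000, "valid": 1000, "test": 3000}
TOWER_HEIGHTS = (3, 4, 5)
SEQ_LEN = 5  # clips are subsampled in time to 5 frames for training


def preprocess_clip(clip: np.ndarray, train: bool = True, rng=None) -> np.ndarray:
    """clip: float array of shape (T, H, W); returns (SEQ_LEN, H, W)."""
    if rng is None:
        rng = np.random.default_rng()

    # Subsample the time dimension down to SEQ_LEN frames (a uniform stride is
    # an assumption; the paper only says the sequence is reduced to 5 steps).
    idx = np.linspace(0, clip.shape[0] - 1, SEQ_LEN).round().astype(int)
    clip = clip[idx]

    # Contrast-normalize every frame independently.
    mean = clip.mean(axis=(1, 2), keepdims=True)
    std = clip.std(axis=(1, 2), keepdims=True) + 1e-8
    clip = (clip - mean) / std

    if train:
        # Random horizontal flip, applied to the whole clip consistently
        # (whether augmentation is per frame or per clip is not stated).
        if rng.random() < 0.5:
            clip = clip[:, :, ::-1]
        # Random brightness / contrast jitter (the ranges are assumptions).
        clip = clip * rng.uniform(0.8, 1.2) + rng.uniform(-0.1, 0.1)

    return clip
```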
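
The Experiment Setup row also describes the optimization recipe: mean squared error between the generated last frame and the ground-truth frame, minimized with Adam, with early stopping after 100 epochs without validation improvement. The sketch below shows what such a loop could look like; the choice of PyTorch, the tiny placeholder predictor, and the loader interface are assumptions, since the paper's actual architecture is specified in its Tables 1 and 2 and no code is released.

```python
# Hedged sketch of the training procedure from the "Experiment Setup" row:
# MSE between the generated last frame and the ground truth, Adam, and early
# stopping after 100 epochs without validation improvement.
import copy
import torch
import torch.nn as nn


class FramePredictor(nn.Module):
    """Placeholder: maps the first 4 frames of a clip to a predicted last frame."""

    def __init__(self, in_frames: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_frames, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, frames):      # frames: (B, 4, H, W)
        return self.net(frames)     # predicted last frame: (B, 1, H, W)


def train(model, train_loader, valid_loader, patience: int = 100, max_epochs: int = 10000):
    opt = torch.optim.Adam(model.parameters())
    loss_fn = nn.MSELoss()
    best_loss, best_state, epochs_since_best = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for context, target in train_loader:   # (B, 4, H, W), (B, 1, H, W)
            opt.zero_grad()
            loss = loss_fn(model(context), target)
            loss.backward()
            opt.step()

        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(c), t).item() for c, t in valid_loader) / len(valid_loader)

        if val < best_loss:
            best_loss, best_state, epochs_since_best = val, copy.deepcopy(model.state_dict()), 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:  # early stopping, patience = 100 epochs
                break

    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```

Restoring the best validation checkpoint at the end mirrors common early-stopping practice; the paper itself only states the 100-epoch patience criterion.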