PixelCNN Models with Auxiliary Variables for Natural Image Modeling
Authors: Alexander Kolesnikov, Christoph H. Lampert
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally demonstrate benefits of the proposed models, in particular showing that they produce much more realistic-looking image samples than previous state-of-the-art probabilistic models. In Section 4 we experimentally study the proposed Grayscale PixelCNN and Pyramid PixelCNN models on the natural image modeling task and report quantitative and qualitative evaluation results. |
| Researcher Affiliation | Academia | Alexander Kolesnikov and Christoph H. Lampert, IST Austria, Klosterneuburg, Austria. |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | No concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper was found. |
| Open Datasets | Yes | We evaluate the modeling performance of a Grayscale PixelCNN on the CIFAR-10 dataset (Krizhevsky & Hinton, 2009). We rely on the aligned&cropped CelebA dataset (Liu et al., 2015) that contains approximately 200,000 images of size 218×178. |
| Dataset Splits | No | The paper specifies training and test splits (e.g., 'training set with 50,000 images and a test set with 10,000 images' for CIFAR-10, and 'random 95% subset of all images as training set and the remaining images as a test set' for CelebA), but does not explicitly describe a separate validation split (see the split sketch below the table). |
| Hardware Specification | Yes | Concretely, on an NVidia Titan X GPU, our Pyramid PixelCNN without caching optimizations requires approximately 0.004 seconds on average to generate one image pixel, while a PixelCNN++, even with recently suggested caching optimizations, requires roughly 0.05 seconds for the same task. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2014)' for optimization and the 'PixelCNN++ architecture' but does not provide specific version numbers for any software dependencies such as programming languages or libraries. |
| Experiment Setup | Yes | In the Adam optimizer we use an initial learning rate of 0.001, a batch size of 64 images and an exponential learning rate decay of 0.99999 that is applied after each iteration. We train the grayscale model $\hat{p}_\theta(\hat{X})$ for 30 epochs and the conditional model $p_\theta(X \mid \hat{X})$ for 200 epochs. In the Adam optimizer we use an initial learning rate of 0.001, a batch size of 16 and a learning rate decay of 0.999995. We train the model for 60 epochs. For the embedding $f_w(\hat{X})$ we use a PixelCNN++ architecture with 15 residual blocks, with a downsampling layer after residual block number 3 and upsampling layers after residual blocks number 9 and 12. For all convolutional layers we set the number of filters to 100. (A hedged configuration sketch follows the table.) |
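
The CelebA partition quoted under Dataset Splits is a plain random 95%/5% split. A minimal sketch of how such a split could be reproduced is below; the paper does not state a random seed or file layout, so `seed` and the list-of-paths interface are assumptions.

```python
import numpy as np

def random_split(paths, train_fraction=0.95, seed=0):
    """Randomly partition image paths into train/test subsets.

    The 95% train fraction matches the paper's description; the seed
    is an assumption, since the paper does not say how the random
    subset was drawn.
    """
    rng = np.random.default_rng(seed)
    paths = np.asarray(paths)
    perm = rng.permutation(len(paths))
    n_train = int(train_fraction * len(paths))
    return paths[perm[:n_train]], paths[perm[n_train:]]
```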
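
To make the Experiment Setup row concrete, here is a hedged sketch of the quoted optimizer settings in PyTorch. The paper releases no code and does not name a framework, so PyTorch, the stand-in model, and the placeholder loss are all assumptions; only the numeric values (learning rate 0.001, batch size 64, per-iteration exponential decay 0.99999, and the 15-block embedding layout) come from the paper.

```python
import torch
import torch.nn as nn

# Stand-in module; the real Grayscale PixelCNN is not reproduced here.
# Only the optimizer settings below mirror the quoted setup.
model = nn.Conv2d(1, 100, kernel_size=3, padding=1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr 0.001, per the paper
# Exponential learning rate decay of 0.99999, applied after every iteration.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99999)

batch = torch.randn(64, 1, 32, 32)     # batch size 64; CIFAR-10-sized inputs
for step in range(3):                  # illustrative steps; the paper trains 30 epochs
    optimizer.zero_grad()
    loss = model(batch).pow(2).mean()  # placeholder loss, not the paper's likelihood
    loss.backward()
    optimizer.step()
    scheduler.step()                   # decay after each iteration, not each epoch

# Layout of the Pyramid PixelCNN embedding f_w(X-hat), as quoted:
# 15 residual blocks, downsampling after block 3, upsampling after
# blocks 9 and 12, 100 filters in every convolutional layer.
embedding_layout = {
    "residual_blocks": 15,
    "filters": 100,
    "downsample_after_block": [3],
    "upsample_after_blocks": [9, 12],
}
```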