PixelTransformer: Sample Conditioned Signal Generation

Authors: Shubham Tulsiani, Abhinav Gupta

ICML 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We empirically validate our approach across three image datasets and show that we learn to generate diverse and meaningful samples, with the distribution variance reducing given more observed pixels. We also show that our approach is applicable beyond images and can allow generating other types of spatial outputs, e.g. polynomials, 3D shapes, and videos." |
| Researcher Affiliation | Collaboration | ¹Facebook AI Research, ²Carnegie Mellon University. |
| Pseudocode | No | The paper describes computational steps but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a project page URL (https://shubhtuls.github.io/PixelTransformer/) but no explicit statement or direct link to a code repository for the methodology described in the paper. |
| Open Datasets | Yes | "We examine our approach on three different image datasets: CIFAR10 (Krizhevsky, 2009), MNIST (LeCun et al., 1998), and the Cat Faces (Wu et al., 2020) dataset, while using the standard image splits." (A loading sketch follows the table.) |
| Dataset Splits | Yes | Same evidence as Open Datasets: the standard train/test splits that each dataset ships with are used. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper does not explicitly list software dependencies with specific version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA 11.x) used for the implementation or experiments. |
| Experiment Setup | Yes | "We vary the number of observed pixels S randomly between 4 and 2048 (with uniform sampling in log-scale), while the number of query samples Q is set to 2048. During training, the locations x are treated as varying over a continuous domain, using bilinear sampling to obtain the corresponding value; this helps our implementation be agnostic to the image resolution in the dataset. While we train a separate network fθ for each dataset, we use the exact same model, hyper-parameters etc. across them." (A sampling sketch follows the table.) |
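The "standard image splits" noted under Open Datasets correspond to the train/test partitions the datasets ship with. As a minimal sketch, assuming a PyTorch/torchvision setup (the paper does not specify its data-loading code), loading two of the three datasets might look like:

```python
# Minimal sketch of loading the standard splits (assumption: PyTorch/torchvision;
# the paper does not state which data-loading code it used).
import torchvision
import torchvision.transforms as T

transform = T.ToTensor()  # pixel values scaled to [0, 1]

# Standard train/test splits as distributed with each dataset.
cifar_train = torchvision.datasets.CIFAR10("data", train=True, download=True, transform=transform)
cifar_test = torchvision.datasets.CIFAR10("data", train=False, download=True, transform=transform)
mnist_train = torchvision.datasets.MNIST("data", train=True, download=True, transform=transform)
mnist_test = torchvision.datasets.MNIST("data", train=False, download=True, transform=transform)
```

Cat Faces (Wu et al., 2020) is not bundled with torchvision, so it would need a custom loader (e.g., an `ImageFolder`-style dataset pointed at the downloaded images).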
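The Experiment Setup row describes the per-iteration sampling: S observed pixels with S drawn log-uniformly from [4, 2048], a fixed Q = 2048 query pixels, and bilinear sampling at continuous locations so the pipeline is resolution-agnostic. Below is a minimal sketch under those stated numbers, assuming PyTorch; the helper names `sample_num_observed` and `values_at` are illustrative, not from the authors' code.

```python
# Sketch of the per-iteration sampling described under "Experiment Setup".
import math
import torch
import torch.nn.functional as F

Q = 2048  # number of query samples (fixed, per the paper)

def sample_num_observed(s_min=4, s_max=2048):
    """Draw S uniformly in log-scale between s_min and s_max."""
    u = torch.rand(()).item()
    return int(round(math.exp(math.log(s_min) + u * (math.log(s_max) - math.log(s_min)))))

def values_at(image, coords):
    """Bilinearly sample pixel values at continuous locations.

    image:  (C, H, W) tensor; coords: (N, 2) in [0, 1]^2, (x, y) order.
    Returns (N, C) values; sampling in a continuous domain (rather than
    on the pixel grid) is what makes the setup resolution-agnostic.
    """
    grid = (coords * 2 - 1).view(1, 1, -1, 2)       # grid_sample expects [-1, 1]
    vals = F.grid_sample(image.unsqueeze(0), grid,  # output: (1, C, 1, N)
                         mode="bilinear", align_corners=False)
    return vals[0, :, 0].t()                        # (N, C)

# Usage: one training example = S observed pairs + Q query pairs.
image = torch.rand(3, 32, 32)  # e.g., a CIFAR10 image
S = sample_num_observed()
obs_xy, query_xy = torch.rand(S, 2), torch.rand(Q, 2)
obs_v, query_v = values_at(image, obs_xy), values_at(image, query_xy)
```

Because locations live in [0, 1]² rather than on the pixel grid, the same code works unchanged for 28×28 MNIST and 32×32 CIFAR10 images, which is the resolution-agnosticism the quoted passage refers to.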