PixelNN: Example-based Image Synthesis

Authors: Aayush Bansal, Yaser Sheikh, Deva Ramanan

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now present our findings for multiple modalities such as a low-resolution image (12×12 image), a surface normal map, and edges/boundaries for domains such as human faces, cats, dogs, handbags, and shoes. We compare our approach both quantitatively and qualitatively with the recent work of Isola et al. (2016) that use generative adversarial networks for pixel-to-pixel translation.
Researcher Affiliation | Academia | Aayush Bansal, Yaser Sheikh, Deva Ramanan; Carnegie Mellon University; {aayushb,yaser,deva}@cs.cmu.edu
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured code-like steps for its method.
Open Source Code | No | The paper links to 'scanner' (https://github.com/scanner-research/scanner) as a system-level optimization that 'may potentially be useful', but it is not presented as the open-source code for the methodology described in this paper.
Open Datasets | Yes | We use 100,000 images from the training set of CUHK CelebA dataset (Liu et al., 2015) to train a regression model and do NN. We use 3,686 images of cats and dogs from the Oxford-IIIT Pet dataset (Parkhi et al., 2012). 50,000 training images of shoes were used from (Yu & Grauman, 2014), and 137,000 images of Amazon handbags from (Zhu et al., 2016).
Dataset Splits | Yes | We used 100,000 images from the training set of CUHK CelebA dataset (Liu et al., 2015) to train a regression model and do NN. We used the subset of test images to evaluate our approach. Of these 3,686 images of cats and dogs from the Oxford-IIIT Pet dataset (Parkhi et al., 2012), 3,000 images were used for training, and remaining 686 for evaluation.
Hardware Specification | No | Importantly, we make use of a single CPU to perform our nearest neighbor search, while Isola et al. (2016) makes use of a GPU. The paper does not specify the model or detailed specifications of the CPU or GPU used.
Software Dependencies | No | The paper mentions using 'PixelNet' and 'HED' but does not provide specific version numbers for these or any other software dependencies, libraries, or frameworks used.
Experiment Setup | Yes | In practice, we vary K from {1, 2, ..., 10} and T from {1, 3, 5, 10, 96} and generate 72 candidate outputs for a given input.
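
The parameter sweep quoted under Experiment Setup can be illustrated with a minimal Python sketch. It only enumerates (K, T) pairs from the two quoted value sets; the function name `candidate_settings` is hypothetical, and this summary does not define what K and T control. The full Cartesian grid of these values has 50 pairs, and this summary does not say how the quoted 72 candidate outputs are formed, so the sketch should be read as an illustration of sweeping two parameters rather than as the authors' procedure.

```python
from itertools import product

# Value sets quoted in the Experiment Setup row.
K_VALUES = list(range(1, 11))   # K in {1, 2, ..., 10}
T_VALUES = [1, 3, 5, 10, 96]    # T in {1, 3, 5, 10, 96}


def candidate_settings(k_values=K_VALUES, t_values=T_VALUES):
    """Return every (K, T) pair in the full grid (hypothetical helper).

    Assumption: a plain Cartesian product. The source does not state
    which combinations are actually used to produce its candidates.
    """
    return list(product(k_values, t_values))


settings = candidate_settings()
print(len(settings))  # number of (K, T) pairs in the full grid
```

Each pair would correspond to one run of the synthesis step with those settings, yielding one candidate output per pair under this assumption.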