Learning to See by Looking at Noise

Authors: Manel Baradad Jurjo, Jonas Wulff, Tongzhou Wang, Phillip Isola, Antonio Torralba

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate performance using Imagenet-100 [60] and the Visual Task Adaptation Benchmark [61]. Figures 3 and 4 show the performance for the proposed fully generative methods from noise on Imagenet-100 and VTAB (tables can be found in the Sup. Mat.)."
Researcher Affiliation | Academia | All five authors are affiliated with MIT CSAIL: Manel Baradad (mbaradad@mit.edu), Jonas Wulff (jwulff@csail.mit.edu), Tongzhou Wang (tongzhou@mit.edu), Phillip Isola (phillipi@mit.edu), Antonio Torralba (torralba@mit.edu).
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a statement about releasing open-source code, nor a link to a code repository for its methodology.
Open Datasets | Yes | "We evaluate performance using Imagenet-100 [60] and the Visual Task Adaptation Benchmark [61]. As an upper-bound for the maximum expected performance with synthetic images, we consider the same training procedure but using the following real datasets: 1) Places365 [62] ... 2) STL-10 [63] ... 3) Imagenet1k [1]" (A sketch of fetching two of these datasets follows the table.)
Dataset Splits | Yes | "For each of the datasets in VTAB, we fix the number of training and validation samples to 20k at random for the datasets where there are more samples available." (A minimal sketch of this capping step appears after the table.)
Hardware Specification | No | The paper acknowledges "computation resources from the Satori cluster donated by IBM to MIT" but does not provide specific hardware details such as GPU/CPU models or memory.
Software Dependencies | No | The paper mentions models such as an AlexNet-based encoder, MoCo v2, and StyleGAN2, but does not list specific software dependencies with version numbers (e.g., "PyTorch 1.9").
Experiment Setup | Yes | "We generate 105k samples using the proposed image models at 128x128 resolution, which are then downsampled to 96x96 and cropped at random to 64x64 before being fed to the encoder. We fix a common set of hyperparameters for all the methods under test to the values found to perform well by the authors of [58]." (A hedged sketch of this preprocessing chain appears below.)
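
The two smaller real-image baselines quoted in the Open Datasets row are directly downloadable, which supports the "Yes" assessment. The sketch below fetches them through torchvision; this is a convenience illustration, not the authors' pipeline, and the "data" root directory is an arbitrary assumption (Imagenet-1k requires a manual download and is omitted).

```python
from torchvision import datasets

# Hedged sketch: fetch two of the real-image baseline datasets via torchvision.
# The "data" root is an arbitrary choice; neither path nor splits come from the paper.
stl10 = datasets.STL10(root="data", split="train", download=True)
places = datasets.Places365(root="data", split="train-standard",
                            small=True, download=True)  # 256x256 variant; large download

print(len(stl10), len(places))  # split sizes once the archives are fetched
```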
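
The Dataset Splits row quotes a simple capping protocol: each VTAB task's training and validation sets are limited to 20k examples drawn at random when more are available. A minimal sketch of that step is below, assuming NumPy-based index selection; the function name, seeds, and example sizes are illustrative, since the paper does not describe an implementation.

```python
import numpy as np

def cap_split(num_examples: int, cap: int = 20_000, seed: int = 0) -> np.ndarray:
    """Pick at most `cap` random indices from a split of `num_examples` items.

    Splits with fewer than `cap` examples are kept whole, matching the
    paper's "where there are more samples available" clause.
    """
    rng = np.random.default_rng(seed)
    if num_examples <= cap:
        return np.arange(num_examples)
    return rng.choice(num_examples, size=cap, replace=False)

# Illustrative sizes only: a VTAB task with a large train set and a small val set.
train_idx = cap_split(75_000, seed=0)  # 20k random indices
val_idx = cap_split(10_000, seed=1)    # all 10k kept (fewer than the cap)
```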
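
The preprocessing chain quoted in the Experiment Setup row (generate at 128x128, downsample to 96x96, random-crop to 64x64) maps naturally onto torchvision transforms. The sketch below is one hedged reading of that sentence: the interpolation mode, the tensor conversion, and the stand-in input image are assumptions, not details from the paper.

```python
from PIL import Image
from torchvision import transforms

# Sketch of the quoted chain: 128x128 generated image -> 96x96 -> random 64x64 crop.
preprocess = transforms.Compose([
    transforms.Resize((96, 96)),  # downsample; interpolation mode is an assumption
    transforms.RandomCrop(64),    # random 64x64 crop fed to the encoder
    transforms.ToTensor(),        # PIL image -> float tensor in [0, 1]
])

img = Image.new("RGB", (128, 128))  # stand-in for one generated noise image
x = preprocess(img)                 # tensor of shape [3, 64, 64]
```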