Learning What and Where to Draw
Authors: Scott E. Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show high-quality 128×128 image synthesis on the Caltech-UCSD Birds dataset, conditioned on both informal text descriptions and also object location. In this section we describe our experiments on generating images from text descriptions on the Caltech-UCSD Birds (CUB) and MPII Human Pose (MHP) datasets. |
| Researcher Affiliation | Collaboration | Scott Reed¹ (reedscot@google.com), Zeynep Akata² (akata@mpi-inf.mpg.de), Santosh Mohan¹ (santoshm@umich.edu), Samuel Tenka¹ (samtenka@umich.edu), Bernt Schiele² (schiele@mpi-inf.mpg.de), Honglak Lee¹ (honglak@umich.edu). ¹University of Michigan, Ann Arbor, USA; ²Max Planck Institute for Informatics, Saarbrücken, Germany. The majority of this work was done while the first author was at the University of Michigan and completed while at DeepMind. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the model architecture and mathematical formulations but includes no explicit pseudocode. |
| Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the GAWWN methodology described. It mentions using and adapting third-party code (dcgan.torch, spatial transformer module by Oquab), but not releasing their specific implementation. |
| Open Datasets | Yes | CUB [Wah et al., 2011] has 11,788 images of birds belonging to one of 200 different species. We also use the text dataset from Reed et al. [2016a] including 10 single-sentence descriptions per bird image. MHP [Andriluka et al., 2014] has 25K images with 410 different common activities. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., exact percentages, sample counts for training, validation, and test sets). It mentions training models on 'all categories' and showing samples on 'held out captions', implying a train/test split, but without detailed breakdown or a separate validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for running its experiments. It only mentions training models. |
| Software Dependencies | No | The paper mentions using a 'Torch implementation provided by Oquab [2016]' and that 'Our GAN implementation is loosely based on dcgan.torch4', but it does not provide specific version numbers for these or other software dependencies (e.g., Python, PyTorch, CUDA versions) required for replication. |
| Experiment Setup | Yes | For both CUB and MHP, we trained our GAWWN using the ADAM solver with batch size 16 and learning rate 0.0002 (See Alg. 1 in [Reed et al., 2016b] for the conditional GAN training algorithm). |
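
The Experiment Setup row quotes the only concrete training details in the paper: Adam, batch size 16, learning rate 0.0002, with the conditional GAN training procedure of Reed et al. [2016b]. The sketch below illustrates what that optimizer setup and a single conditional GAN update look like. It is not the authors' Torch implementation: the toy generator/discriminator architectures, the text and noise dimensionalities, the Adam betas (the common DCGAN choice of 0.5/0.999), and the omission of the location-conditioning pathway are all assumptions made purely for illustration.

```python
# Minimal sketch of a conditional (text-conditioned) GAN training step using the
# hyperparameters quoted above: Adam, batch size 16, learning rate 2e-4.
# NOT the authors' GAWWN implementation; architectures and dimensions are placeholders.
import torch
import torch.nn as nn

BATCH_SIZE = 16
LR = 2e-4
TEXT_DIM = 1024   # assumed text-embedding dimensionality
NOISE_DIM = 100   # assumed noise dimensionality


class ToyGenerator(nn.Module):
    """Placeholder generator: maps (noise, text embedding) to a 128x128 RGB image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + TEXT_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, 3 * 128 * 128),
            nn.Tanh(),
        )

    def forward(self, z, txt):
        return self.net(torch.cat([z, txt], dim=1)).view(-1, 3, 128, 128)


class ToyDiscriminator(nn.Module):
    """Placeholder discriminator: scores an image/text-embedding pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * 128 * 128 + TEXT_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, img, txt):
        return self.net(torch.cat([img.flatten(1), txt], dim=1))


G, D = ToyGenerator(), ToyDiscriminator()
# Adam with learning rate 0.0002 as stated in the paper; betas (0.5, 0.999) are an
# assumption following the usual DCGAN convention.
opt_g = torch.optim.Adam(G.parameters(), lr=LR, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=LR, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

# One illustrative training step on random tensors standing in for real data.
real_img = torch.randn(BATCH_SIZE, 3, 128, 128)
txt = torch.randn(BATCH_SIZE, TEXT_DIM)
z = torch.randn(BATCH_SIZE, NOISE_DIM)

# Discriminator step: real (image, text) pairs labeled 1, generated pairs labeled 0.
fake_img = G(z, txt).detach()
d_loss = bce(D(real_img, txt), torch.ones(BATCH_SIZE, 1)) + \
         bce(D(fake_img, txt), torch.zeros(BATCH_SIZE, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: push the discriminator to label generated pairs as real.
g_loss = bce(D(G(z, txt), txt), torch.ones(BATCH_SIZE, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```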