Generating Multiple Objects at Spatially Distinct Locations

Authors: Tobias Hinz, Stefan Heinrich, Stefan Wermter

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform experiments on the Multi-MNIST, CLEVR, and the more complex MSCOCO data set. Our experiments show that through the use of the object pathway we can control object locations within images and can model complex scenes with multiple objects at various locations.
Researcher Affiliation | Academia | Tobias Hinz, Stefan Heinrich, Stefan Wermter; Knowledge Technology, Department of Informatics, Universität Hamburg, Vogt-Koelln-Str. 30, 22527 Hamburg, Germany
Pseudocode | No | The paper describes the components of the generator and discriminator in detail using prose and figures, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code can be found here: https://github.com/tohinz/multiple-objects-gan
Open Datasets | Yes | We perform experiments on the Multi-MNIST, CLEVR, and the more complex MSCOCO data set. For the evaluation, we aim to study the quality of the generated images with a particular focus on the generalization capabilities and the contribution of specific parts of our model, in both controllable and large-scale cases. Thus, in the following sections, we evaluate our approach on three different data sets: the Multi-MNIST data set, the CLEVR data set, and the MS-COCO data set.
Dataset Splits | Yes | For our final experiment, we used the MS-COCO data set (Lin et al., 2014) to evaluate our model on natural images of complex scenes. In order to keep our evaluation comparable to previous work, we used the 2014 train/test split consisting of roughly 80,000 training and 40,000 test images and rescaled the images to a resolution of 256 × 256 px.
Hardware Specification | No | The acknowledgments mention 'NVIDIA Corporation for their support through the GPU Grant Program', but no specific GPU model, CPU type, or other detailed hardware specifications for the experimental setup are provided.
Software Dependencies | No | The paper mentions using frameworks like StackGAN and AttnGAN and specific techniques like Spatial Transformer Networks and YOLOv3, and provides links to their implementations, but it does not specify software versions for any libraries, frameworks, or programming languages (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Appendix A provides implementation details for the Multi-MNIST, CLEVR, and MS-COCO experiments, including architectural specifications (e.g., convolutional layer parameters, upsampling blocks), activation functions (e.g., leaky ReLU with alpha=0.2), optimizer (Adam), learning rate (0.0002), batch sizes (128, 40), number of training epochs (20-120), and weight initialization (N(0, 0.02)). Table 2 further lists filter sizes, strides, and padding for various layers.
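
The hyperparameters quoted in the Experiment Setup row can be gathered into a small configuration sketch. The Python below is illustrative only and is not taken from the authors' repository: the names `ReportedSetup`, `leaky_relu`, and `init_weights` are hypothetical, and reading the second argument of N(0, 0.02) as a standard deviation is an assumption. The summary does not say which batch size or epoch count belongs to which data set, so those values are stored as reported rather than assigned.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup row (hypothetical container)."""
    optimizer: str = "Adam"
    learning_rate: float = 0.0002
    leaky_relu_alpha: float = 0.2
    init_mean: float = 0.0
    # ASSUMPTION: 0.02 in N(0, 0.02) is treated as a standard deviation.
    init_spread: float = 0.02
    # Reported values only; the per-data-set mapping is not given here.
    batch_sizes: tuple = (128, 40)
    epoch_range: tuple = (20, 120)


def leaky_relu(x: float, alpha: float = 0.2) -> float:
    """Leaky ReLU: identity for positive inputs, slope alpha otherwise."""
    return x if x > 0 else alpha * x


def init_weights(n: int, setup: ReportedSetup, seed: int = 0) -> list:
    """Draw n weights from a normal distribution using the reported parameters."""
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    return [rng.gauss(setup.init_mean, setup.init_spread) for _ in range(n)]


if __name__ == "__main__":
    setup = ReportedSetup()
    weights = init_weights(5, setup)
    print(setup.optimizer, setup.learning_rate)
    print([leaky_relu(w, setup.leaky_relu_alpha) for w in weights])
```

In a real training script these values would typically be passed to the deep learning framework's optimizer and initializer; which framework and version the authors used is not stated in the summary above.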