Object-Centric Image Generation from Layouts

Authors: Tristan Sylvain, Pengchuan Zhang, Yoshua Bengio, R Devon Hjelm, Shikhar Sharma

AAAI 2021, pp. 2647-2655

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive quantitative evaluation and ablation studies demonstrate the impact of our contributions, with our model outperforming previous state-of-the-art approaches on both the COCO-Stuff and Visual Genome datasets.
Researcher Affiliation | Collaboration | 1 Mila, Montréal, Canada; 2 Université de Montréal, Montréal, Canada; 3 Microsoft Research; 4 CIFAR Senior Fellow; 5 Microsoft Turing. {tristan.sylvain, yoshua.bengio}@mila.quebec, {penzhan, devon.hjelm, shikhar.sharma}@microsoft.com
Pseudocode | No | The paper describes the architecture and loss functions but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper states 'Our code is written in PyTorch (Paszke et al. 2019)' but does not provide a link or an explicit statement about the public release of their own source code. The arXiv link provided is for the paper itself, not a code repository.
Open Datasets | Yes | We run experiments on the COCO-Stuff (Caesar, Uijlings, and Ferrari 2018) and Visual Genome (VG) (Krishna et al. 2017) datasets, which have been the popular choice for layout- and scene-to-image tasks as they provide diverse and high-quality annotations.
Dataset Splits | Yes | We apply the same pre-processing and use the same splits as Johnson, Gupta, and Fei-Fei (2018); Zhao et al. (2019)... 128×128 models and above were trained for up to 300,000 iterations, 64×64 models were trained for up to 200,000 iterations (early stopping on a validation set).
Hardware Specification | Yes | Each experiment ran on 4 V100 GPUs in parallel.
Software Dependencies | No | The paper states 'Our code is written in PyTorch (Paszke et al. 2019)' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | We used the Adam (Kingma and Ba 2015) solver, with β1 = 0.5, β2 = 0.999. The global learning rate for both generator and discriminators is 0.0001. 128×128 models and above were trained for up to 300,000 iterations, 64×64 models were trained for up to 200,000 iterations (early stopping on a validation set). The SGSM module is trained separately for 200 epochs.
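
As a reading aid, the reported experiment setup maps onto roughly the following PyTorch snippet. This is a minimal sketch with placeholder generator and discriminator modules and an empty training step: only the Adam hyperparameters (β1 = 0.5, β2 = 0.999, learning rate 1e-4 for both networks) and the iteration budgets are taken from the paper; the architectures, losses, and data loading are not the authors' code.

```python
import torch.nn as nn
from torch.optim import Adam

# Placeholder networks; the paper's actual generator/discriminator
# architectures are not reproduced here.
generator = nn.Sequential(nn.Linear(128, 128))
discriminator = nn.Sequential(nn.Linear(128, 1))

# Reported settings: Adam with beta1 = 0.5, beta2 = 0.999 and a global
# learning rate of 1e-4 for both generator and discriminators.
g_opt = Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.999))
d_opt = Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.999))

# Reported iteration budgets: up to 300,000 iterations for 128x128 models
# and above, up to 200,000 for 64x64 models, with early stopping on a
# validation set. The loop body below is illustrative only.
max_iters = 300_000
for step in range(max_iters):
    # d_loss / g_loss would be computed from a batch here (omitted), then:
    # d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    break  # placeholder so the sketch runs as-is
```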