Scene Graph to Image Synthesis via Knowledge Consensus

Authors: Yang Wu, Pengxu Wei, Liang Lin

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results demonstrate that, even conditioned only on scene graphs, our model surprisingly achieves superior performance on semantics-aware image generation, without losing the competence on manipulating the generation through knowledge graphs.
Researcher Affiliation | Academia | Yang Wu¹, Pengxu Wei¹,*, Liang Lin¹,²; ¹ School of Computer Science and Engineering, Sun Yat-sen University; ² Key Laboratory of Information Security Technology, Guang Dong Province
Pseudocode | Yes | Our learning strategy for these three modules are provided in Appendix A, Algorithm 1, where we use C to represent all VAE models for different graph components to avoid restatement.
Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code will be made available.
Open Datasets | Yes | We evaluate the proposed KCGM on Visual Genome (VG) dataset (Krishna et al. 2017).
Dataset Splits | No | The paper mentions using the Visual Genome dataset and a batch size of 72, but does not specify the train/validation/test splits (e.g., percentages or counts) or a cross-validation setup.
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | Adam (Kingma and Ba 2015) is applied as the optimizer to learn the overall framework. However, no specific version numbers for Adam or other software dependencies are provided.
Experiment Setup | Yes | The batch size during model optimization is set to be 72 for 128 × 128 generated image size. Each component has different learning rates and other hyperparameters. The GAN generator is updated every 4 iterations after the discriminator is updated. We pre-train the GID (including a Ge VAE and β-VAEs) before jointly learning with the generative module.
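
The update schedule described in the Experiment Setup entry can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the discriminator is updated at every iteration and the generator at every fourth one, which is one reading of "updated every 4 iterations after the discriminator is updated". The names TrainConfig and update_schedule are hypothetical, and the per-component learning rates are omitted because the summary does not state them.

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Hyperparameters quoted in the summary; field names are hypothetical."""
    batch_size: int = 72        # batch size reported for 128 x 128 images
    image_size: int = 128       # generated images are 128 x 128
    gen_update_every: int = 4   # assumed: one generator step per 4 iterations


def update_schedule(num_iterations, cfg):
    """Return (iteration, modules_updated) pairs for a GAN training loop.

    Assumption: the discriminator steps at every iteration, and the
    generator steps only when the iteration index is a multiple of
    cfg.gen_update_every.
    """
    schedule = []
    for it in range(1, num_iterations + 1):
        steps = ["discriminator"]
        if it % cfg.gen_update_every == 0:
            steps.append("generator")
        schedule.append((it, steps))
    return schedule


cfg = TrainConfig()
schedule = update_schedule(8, cfg)
gen_iters = [it for it, steps in schedule if "generator" in steps]
print(gen_iters)  # [4, 8]
```

In a real training loop the two branches would call the respective optimizer steps, and the GID pre-training mentioned in the summary would run as a separate stage before this loop starts.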