Scene Graph to Image Synthesis via Knowledge Consensus
Authors: Yang Wu, Pengxu Wei, Liang Lin
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that, even conditioned only on scene graphs, our model surprisingly achieves superior performance on semantics-aware image generation, without losing the competence on manipulating the generation through knowledge graphs. |
| Researcher Affiliation | Academia | Yang Wu1, Pengxu Wei1,*, Liang Lin1,2 1 School of Computer Science and Engineering, Sun Yat-sen University 2 Key Laboratory of Information Security Technology, Guang Dong Province |
| Pseudocode | Yes | Our learning strategy for these three modules are provided in Appendix A, Algorithm 1, where we use C to represent all VAE models for different graph components to avoid restatement. |
| Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code will be made available. |
| Open Datasets | Yes | We evaluate the proposed KCGM on Visual Genome (VG) dataset (Krishna et al. 2017). |
| Dataset Splits | No | The paper mentions using the Visual Genome dataset and a batch size of 72, but does not specify the train/validation/test splits (e.g., percentages or counts) or a cross-validation setup. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | Adam (Kingma and Ba 2015) is applied as the optimizer to learn the overall framework. However, no specific version numbers for Adam or other software dependencies are provided. |
| Experiment Setup | Yes | The batch size during model optimization is set to be 72 for 128×128 generated image size. Each component has different learning rates and other hyperparameters. The GAN generator is updated every 4 iterations after the discriminator is updated. We pre-train the GID (including a Ge VAE and β-VAEs) before jointly learning with the generative module. |
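The alternating update schedule quoted in the Experiment Setup row (discriminator updated every iteration, generator every 4 iterations) can be sketched as follows. This is a minimal illustration of the schedule only, not the authors' code; `train_schedule` and `gen_every` are hypothetical names.

```python
# Minimal sketch of the 4:1 discriminator/generator update schedule
# described in the paper (hypothetical helper, not the authors' code).
def train_schedule(num_iters, gen_every=4):
    """Count discriminator and generator updates over num_iters iterations.

    The discriminator takes a step every iteration; the generator takes a
    step only every `gen_every`-th iteration.
    """
    disc_updates = 0
    gen_updates = 0
    for it in range(1, num_iters + 1):
        disc_updates += 1          # discriminator step each iteration
        if it % gen_every == 0:    # generator step every 4th iteration
            gen_updates += 1
    return disc_updates, gen_updates

# Example: over 72 iterations the discriminator is updated 72 times
# and the generator 18 times.
print(train_schedule(72))  # → (72, 18)
```

In practice each counted step would call the corresponding optimizer (the paper reports Adam) on the discriminator or generator losses; the sketch only captures the relative update frequency.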