Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Authors: Wan-Cyuan Fan, Yen-Chun Chen, DongDong Chen, Yu Cheng, Lu Yuan, Yu-Chiang Frank Wang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments over various unconditioned and conditional image generation tasks, ranging from text-to-image synthesis, layout-to-image, scene-graph-to-image, to label-to-image. More specifically, we achieved state-of-the-art FID scores on five benchmarks, namely layout-to-image on COCO and Open Images, scene-graph-to-image on COCO and Visual Genome, and label-to-image on COCO. |
| Researcher Affiliation | Collaboration | Wan-Cyuan Fan¹*, Yen-Chun Chen², DongDong Chen², Yu Cheng², Lu Yuan², Yu-Chiang Frank Wang¹,³; ¹National Taiwan University, ²Microsoft Corporation, ³NVIDIA |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability for the described methodology. |
| Open Datasets | Yes | The main tasks we considered are text-to-image generation (T2I) on COCO (Lin et al. 2014), scene-graph-to-image generation (SG2I) on COCO-stuff and Visual Genome (VG) (Krishna et al. 2017), label-to-image generation (Label2I) (Jyothi et al. 2019) on COCO-stuff (Lin et al. 2014), and layout-to-image generation (Layout2I) on COCO-stuff and Open Images (Kuznetsova et al. 2020). |
| Dataset Splits | No | The paper mentions training on 'COCO train2014 split' and conducting experiments on 'validation splits', but does not provide specific percentages or sample counts for training, validation, and test dataset splits for all experiments, nor does it consistently reference predefined splits with explicit citations for all datasets. |
| Hardware Specification | Yes | Note that the experiments are done on validation splits with batch size of 32 using 1 V100. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as libraries, frameworks, or programming languages used for the experiments. |
| Experiment Setup | Yes | For LDM scores, T = 250; for Frido, T = 200. (...) G: classifier-free guidance with scale = 2.0. (...) For ablations and hyperparameter tunings, we train for 250K iterations to allow more experiments. |
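
The settings quoted in the Experiment Setup and Hardware Specification rows can be gathered into one small configuration, sketched below. This is an illustration only, not the authors' code: the class name `SamplingSettings`, its field names, and the helper `guided_noise` are hypothetical. The guidance rule shown is one common formulation of classifier-free guidance; the quoted excerpt does not state which formulation the paper uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingSettings:
    """Values quoted in the table above; all field names are hypothetical."""
    sampling_steps: int   # T, the number of diffusion sampling steps
    batch_size: int = 32  # reported batch size on 1 V100


# T = 200 for Frido and T = 250 for LDM, as quoted in the Experiment Setup row.
FRIDO = SamplingSettings(sampling_steps=200)
LDM = SamplingSettings(sampling_steps=250)

# Classifier-free guidance scale quoted for the "G" entries.
GUIDANCE_SCALE = 2.0


def guided_noise(eps_uncond, eps_cond, scale=GUIDANCE_SCALE):
    """Combine unconditional and conditional noise predictions.

    Assumed formulation: eps = eps_uncond + scale * (eps_cond - eps_uncond).
    With scale = 1 this returns the conditional prediction unchanged.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
```

Under this assumed formulation, a scale of 2.0 moves the prediction past the conditional estimate by the same distance that separates it from the unconditional one.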