Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation
Authors: Quang Nguyen, Truong Vu, Anh Tran, Khoi Nguyen
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct evaluations on two datasets, PASCAL VOC and MSCOCO, and our approach significantly outperforms concurrent work. To evaluate the quality of the synthesized datasets, we introduce two benchmark datasets: synth-VOC and synth-COCO. We conduct all ablation study experiments on the text prompts described in Sec. 3.1. |
| Researcher Affiliation | Collaboration | Quang Nguyen¹,², Truong Vu¹, Anh Tran¹, Khoi Nguyen¹; ¹VinAI Research, ²Ho Chi Minh City University of Technology, VNU-HCM |
| Pseudocode | No | The paper describes procedures in text and uses equations, but there are no explicitly labeled "Pseudocode" or "Algorithm" blocks or figures. |
| Open Source Code | Yes | Our benchmarks and code will be released at https://github.com/VinAIResearch/Dataset-Diffusion. |
| Open Datasets | Yes | We evaluate our Dataset Diffusion on two datasets: PASCAL VOC 2012 [10] and COCO 2017 [11]. |
| Dataset Splits | Yes | The PASCAL VOC 2012 dataset ... to have a total of 12,046 training, 1,449 validation, and 1,456 test images. The COCO 2017 dataset contains 80 object classes and 1 background class with 118,288 training and 5K validation images, along with provided captions for each image. |
| Hardware Specification | Yes | We conduct our experiments on NVIDIA A100 40G GPUs. |
| Software Dependencies | Yes | We build our framework on the PyTorch deep learning framework [52] and Stable Diffusion [5] version 2.1-base with T = 100 timesteps. Regarding the semantic segmenter, we employ the DeepLabV3 [15] and Mask2Former [24] segmenters implemented in the MMSegmentation framework [53]. (See the loading sketch below the table.) |
| Experiment Setup | Yes | We construct the masks using optimal values for τ, α, and β, which are defined in Sec. 6.2. We use the AdamW optimizer with a learning rate of 1e-4 and weight decay of 1e-4. For other hyper-parameters, we follow standard settings in MMSegmentation. (See the optimizer sketch below the table.) |
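
For context on the quoted dependency stack, here is a minimal sketch of loading Stable Diffusion 2.1-base with 100 denoising timesteps through the Hugging Face `diffusers` library. The paper does not publish its loading code, so the pipeline class, model identifier, and prompt below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumption): loading Stable Diffusion 2.1-base via the
# Hugging Face `diffusers` library, matching the version quoted in the table.
# This is not the authors' actual code.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # the paper reports NVIDIA A100 40G GPUs

# num_inference_steps = 100 mirrors the T = 100 timesteps quoted above;
# the prompt is a hypothetical class-name prompt, not taken from the paper.
image = pipe("a photo of a dog and a cat", num_inference_steps=100).images[0]
image.save("synthetic_sample.png")
```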
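Similarly, the quoted optimizer settings can be written as a plain PyTorch sketch. The authors train their segmenters through MMSegmentation configs, so the stand-in model here is a hypothetical placeholder that only mirrors the reported hyper-parameters.

```python
# Minimal sketch (assumption): the AdamW configuration quoted above, written
# as plain PyTorch rather than an MMSegmentation config.
import torch
from torch.optim import AdamW

# Hypothetical stand-in for the DeepLabV3 segmenter head
# (21 classes for PASCAL VOC: 20 objects plus background).
model = torch.nn.Conv2d(256, 21, kernel_size=1)

# Learning rate and weight decay as reported in the paper.
optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)
```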