Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation

Authors: Quang Nguyen, Truong Vu, Anh Tran, Khoi Nguyen

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct evaluations on two datasets, PASCAL VOC and MSCOCO, and our approach significantly outperforms concurrent work. To evaluate the quality of the synthesized datasets, we introduce two benchmark datasets: synth-VOC and synth-COCO. We conduct all ablation study experiments on the text prompts described in Sec. 3.1.
Researcher Affiliation | Collaboration | Quang Nguyen (1, 2), Truong Vu (1), Anh Tran (1), Khoi Nguyen (1). (1) VinAI Research; (2) Ho Chi Minh City University of Technology, VNU-HCM
Pseudocode | No | The paper describes procedures in text and uses equations, but there are no explicitly labeled "Pseudocode" or "Algorithm" blocks or figures.
Open Source Code | Yes | Our benchmarks and code will be released at https://github.com/VinAIResearch/Dataset-Diffusion.
Open Datasets | Yes | We evaluate our Dataset Diffusion on two datasets: PASCAL VOC 2012 [10] and COCO 2017 [11].
Dataset Splits | Yes | The PASCAL VOC 2012 dataset ... to have a total of 12,046 training, 1,449 validation, and 1,456 test images. The COCO 2017 dataset contains 80 object classes and 1 background class with 118,288 training and 5K validation images, along with provided captions for each image.
Hardware Specification | Yes | We conduct our experiments on NVIDIA A100 40G GPUs.
Software Dependencies | Yes | We build our framework on PyTorch deep learning framework [52] and Stable Diffusion [5] version 2.1-base with T = 100 timesteps. Regarding semantic segmenter, we employ the DeepLabV3 [15] and Mask2Former [24] segmenter implemented in the MMSegmentation framework [53].
Experiment Setup | Yes | We construct the masks using optimal values for τ, α, and β, which are defined in Sec. 6.2. We use the AdamW optimizer with a learning rate of 1e-4 and weight decay of 1e-4. For other hyper-parameters, we follow standard settings in MMSegmentation.
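
The optimizer settings quoted in the Experiment Setup row could be written as an MMSegmentation-style config fragment roughly as follows. This is a minimal sketch, not the authors' released config: only the optimizer type, learning rate, and weight decay come from the quoted text, while the variable names `optimizer` and `optim_wrapper` and the `OptimWrapper` type assume the MMSegmentation 1.x config layout, which the summary does not confirm.

```python
# Hypothetical MMSegmentation-style config fragment, expressed as plain dicts.
# From the reported setup: AdamW, learning rate 1e-4, weight decay 1e-4.
# Assumed (not stated in the source): the MMSegmentation 1.x naming below.
optimizer = dict(
    type="AdamW",       # optimizer named in the Experiment Setup row
    lr=1e-4,            # reported learning rate
    weight_decay=1e-4,  # reported weight decay
)

# In MMSegmentation 1.x configs the optimizer is usually wrapped like this;
# every other hyper-parameter is left at the framework's defaults.
optim_wrapper = dict(type="OptimWrapper", optimizer=optimizer)
```

Settings the summary does not report, such as the learning-rate schedule, batch size, and iteration count, are deliberately left out, since the source only says they follow standard MMSegmentation settings.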