DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models
Authors: Weijia Wu, Yuzhong Zhao, Hao Chen, Yuchao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To showcase the power of the proposed approach, we generate datasets with rich dense pixel-wise labels for a wide range of downstream tasks, including semantic segmentation, instance segmentation, and depth estimation. Notably, it achieves (1) state-of-the-art results on semantic segmentation and instance segmentation; (2) significantly more robust on domain generalization than using the real data alone; and state-of-the-art results in zero-shot segmentation setting; and (3) flexibility for efficient application and novel task composition (e.g., image editing). |
| Researcher Affiliation | Collaboration | 1Zhejiang University, China 2University of Chinese Academy of Sciences, China 3Show Lab, National University of Singapore 4Ant Group |
| Pseudocode | No | The paper includes figures illustrating the framework and decoder architecture, but it does not contain a pseudocode block or an explicitly labeled algorithm. |
| Open Source Code | Yes | The project website is at: weijiawu.github.io/DatasetDM. |
| Open Datasets | Yes | Semantic Segmentation. Pascal-VOC 2012 [15] (20 classes) and Cityscapes [11] (19 classes), as two classical benchmark are used to evaluate. ... Instance Segmentation. For the COCO2017 [33] benchmark... Depth Estimation. We synthesized a total of 80k synthetic images for NYU Depth V2 [46]. ... Pose Estimation. We generated a set of 30k synthetic images for COCO2017 Pose dataset [33]... |
| Dataset Splits | No | The paper discusses training and testing, and mentions using 'COCO val2017' in tables. However, it does not provide specific details on the dataset splits for validation, such as percentages or sample counts for training/validation/test sets. |
| Hardware Specification | Yes | For all tasks, we train DatasetDM for around 50k iterations with images of size 512 × 512, which only need one Tesla V100 GPU, and lasted for approximately 20 hours. |
| Software Dependencies | No | The paper mentions 'Stable diffusion V1 [41] model' and 'Mask2Former [8]' as architectures, and 'Optimizer [36]' as a reference, but it does not specify software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | For all tasks, we train DatasetDM for around 50k iterations with images of size 512 × 512... Optimizer [36] with a learning rate of 0.0001 is used. |
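
The training figures quoted in the Hardware Specification and Experiment Setup rows can be collected into a small configuration record. The sketch below is illustrative only: the class name `ReportedSetup` and its field names are hypothetical and do not come from the paper or its code. The optimizer is left out because the table identifies it only as "Optimizer [36]". The derived seconds-per-iteration value is simple arithmetic on the reported, approximate numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Hypothetical container for the values quoted in the table above."""

    iterations: int = 50_000             # "around 50k iterations"
    image_size: tuple = (512, 512)       # "images of size 512 x 512"
    learning_rate: float = 1e-4          # "learning rate of 0.0001"
    gpu: str = "Tesla V100"              # "one Tesla V100 GPU"
    num_gpus: int = 1
    train_hours: float = 20.0            # "approximately 20 hours"

    def seconds_per_iteration(self) -> float:
        # Rough average wall-clock cost per training iteration.
        return self.train_hours * 3600 / self.iterations


setup = ReportedSetup()
print(f"{setup.seconds_per_iteration():.2f} s/iteration")
```

Since both the iteration count and the training time are reported as approximate, the resulting per-iteration figure is an order-of-magnitude estimate rather than a measured value.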