Distribution-Aware Data Expansion with Diffusion Models

Authors: Haowei Zhu, Ling Yang, Jun-Hai Yong, Hongzhi Yin, Jiawei Jiang, Meng Xiao, Wentao Zhang, Bin Wang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We assess the performance of DistDiff across six image classification datasets, encompassing diverse tasks such as general object classification (Caltech-101 [16], CIFAR100-Subset [31], ImageNette [27]), fine-grained classification (Cars [30]), textural classification (DTD [8]) and medical imaging (PathMNIST [68]). More details are provided in Appendix B.1.
Researcher Affiliation | Academia | 1 Tsinghua University, 2 Peking University, 3 University of Queensland, 4 Wuhan University, 5 CNIC, CAS
Pseudocode | Yes | We present the pseudocode for our algorithm in Algorithm 1, illustrating the hierarchical energy guidance within the diffusion sampling process. (A hedged sketch of such an energy-guided update appears after this table.)
Open Source Code | Yes | Our code is available at https://github.com/haoweiz3/DistDiff.
Open Datasets | Yes | Datasets: We assess the performance of DistDiff across six image classification datasets, encompassing diverse tasks such as general object classification (Caltech-101 [16], CIFAR100-Subset [31], ImageNette [27]), fine-grained classification (Cars [30]), textural classification (DTD [8]) and medical imaging (PathMNIST [68]).
Dataset Splits | Yes | Table 8 provides the detailed statistics of the six experimental datasets, including Caltech-101 [16], CIFAR100-Subset [31], Stanford Cars [30], ImageNette [28], DTD [8], and PathMNIST [68].
NAME | CLASSES | SIZE (TRAIN / TEST)
Caltech-101 | 100 | 3000 / 6085
CIFAR100-Subset | 100 | 10000 / 10000
Stanford Cars | 196 | 8144 / 8041
ImageNette | 10 | 9469 / 3925
DTD | 47 | 3760 / 1880
PathMNIST | 9 | 900 / 7180
Hardware Specification | Yes | The generation, training, and evaluation processes are conducted on a single GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions a 'PyTorch framework with Python 3.10.6' and 'diffusers [62]', but it does not provide specific version numbers for PyTorch or Diffusers, which are key software components; this criterion requires version numbers for all key software components.
Experiment Setup | Yes | In our experimental setup, we implement DistDiff based on Stable Diffusion 1.4 [50]. The images created by Stable Diffusion have a resolution of 512 × 512 for all datasets. Throughout the diffusion process, we employ the DDIM [60] sampler for a 50-step latent diffusion, with the noise strength set at 0.5 and the classifier-free guidance scale at 7.5. The ϵ in Equation 3 is 0.2 by default. We use a ResNet-50 [21] model trained from scratch on the original datasets as our guidance model. We assign K = 3 to each class when constructing group-level prototypes, the learning rate ρ is 10.0, and the number of optimization steps M is set to 20 unless specified otherwise. After expansion, we concatenate the original dataset with the synthetic data to create expanded datasets. We then train the classification model from random initialization for 100 epochs on these expanded datasets. During model training, images are randomly cropped to 224 × 224 and augmented with random rotation and random horizontal flips. Our optimization strategy uses the SGD optimizer with a momentum of 0.9 and cosine decay with an initial learning rate of 0.1.
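
The Pseudocode and Experiment Setup rows describe hierarchical energy guidance applied during DDIM sampling, driven by a ResNet-50 guidance model, K = 3 group-level prototypes per class, learning rate ρ = 10.0, and M = 20 optimization steps. The sketch below shows one plausible form of such a guided latent update. It is a minimal illustration, not the authors' implementation: the exact energy terms (Equation 3 of the paper, with ϵ = 0.2) are not reproduced here, and the prototype-distance energy, the `decode_fn` helper, and the cosine distances in feature space are assumptions.

```python
import torch
import torch.nn.functional as F

def hierarchical_energy(feats, class_proto, group_protos):
    # Illustrative energy: distance to a class-level prototype plus distance
    # to the nearest of the K group-level prototypes in guidance-feature space.
    # The paper's actual energy (Eq. 3, with epsilon = 0.2) may differ.
    e_class = (1.0 - F.cosine_similarity(feats, class_proto[None], dim=-1)).mean()
    sims = F.cosine_similarity(feats[:, None, :], group_protos[None], dim=-1)  # (B, K)
    e_group = (1.0 - sims.max(dim=1).values).mean()
    return e_class + e_group

def guided_update(latent, decode_fn, guidance_model, class_proto, group_protos,
                  rho=10.0, num_steps=20):
    # Refine an intermediate diffusion latent with M gradient steps on the energy.
    # decode_fn: maps latents to guidance-model inputs (assumed differentiable).
    # guidance_model: the ResNet-50 trained on the original dataset.
    latent = latent.detach().requires_grad_(True)
    for _ in range(num_steps):
        feats = guidance_model(decode_fn(latent))
        energy = hierarchical_energy(feats, class_proto, group_protos)
        (grad,) = torch.autograd.grad(energy, latent)
        latent = (latent - rho * grad).detach().requires_grad_(True)
    return latent.detach()
```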
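
As a companion to the Experiment Setup row, the sketch below wires the reported generation and training hyper-parameters into a diffusers / torchvision workflow. The img2img pipeline choice (implied by the noise-strength parameter over real seed images), the `CompVis/stable-diffusion-v1-4` model id, the rotation angle, and the number of classes are assumptions; the remaining values (50 DDIM steps, strength 0.5, guidance scale 7.5, ResNet-50 from scratch, SGD with momentum 0.9, cosine decay from lr 0.1, 100 epochs) come directly from the quoted text.

```python
import torch
import torchvision.transforms as T
from torchvision.models import resnet50
from diffusers import StableDiffusionImg2ImgPipeline, DDIMScheduler

# --- Generation: Stable Diffusion 1.4, DDIM, 50 steps, strength 0.5, CFG 7.5 ---
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16  # model id assumed
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

def expand(seed_image, prompt):
    # Produce one synthetic image from an original training image.
    return pipe(prompt=prompt, image=seed_image, strength=0.5,
                guidance_scale=7.5, num_inference_steps=50).images[0]

# --- Downstream training: ResNet-50 from scratch, 100 epochs, SGD + cosine decay ---
train_transform = T.Compose([
    T.RandomResizedCrop(224),   # random crop to 224 x 224
    T.RandomRotation(15),       # rotation angle not specified in the paper
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])
model = resnet50(weights=None, num_classes=100)  # num_classes depends on the dataset
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
```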