Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Don't Play Favorites: Minority Guidance for Diffusion Models
Authors: Soobin Um, Suhyeon Lee, Jong Chul Ye
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark real datasets demonstrate that our minority guidance can greatly improve the capability of generating high-quality minority samples over existing generative samplers. We showcase that the performance benefit of our framework persists even in demanding real-world scenarios such as medical imaging, further underscoring the practical significance of our work. |
| Researcher Affiliation | Academia | Soobin Um, Suhyeon Lee & Jong Chul Ye, KAIST, Daejeon, Republic of Korea |
| Pseudocode | No | The paper describes its methods through prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/soobin-um/minority-guidance. |
| Open Datasets | Yes | Datasets. Our experiments are conducted on six real benchmarks: four unconditional and two class-conditional datasets. For the unconditional settings, we employ CelebA 64^2 (Liu et al., 2015), CIFAR-10 (Krizhevsky et al., 2009), and LSUN-Bedrooms 256^2 (Yu et al., 2015)... We use ImageNet 64^2 and 256^2 (Deng et al., 2009)... |
| Dataset Splits | No | The paper mentions using specific subsets of real data for evaluating generated samples (e.g., "10K and 5K real samples yielding the highest AvgkNN values for CelebA and CIFAR-10"), but these are for evaluation metrics, not explicit validation dataset splits for model training or tuning. There is no explicit description of a validation split for the main models used. |
| Hardware Specification | Yes | All these results were obtained using a single A100 GPU. |
| Software Dependencies | No | Our implementation is based on PyTorch (Paszke et al., 2019). While PyTorch is mentioned, a specific version number is not provided, nor are explicit versions for other libraries or dependencies. |
| Experiment Setup | Yes | For the number of minority classes L, we take L = 100 for the three unconditional natural image datasets and L = 25 for ImageNet and CIFAR-10-LT. We use L = 50 for the brain MRI experiments... We swept over {1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 6.0, ..., 10.0} for the classifier scale w. We employ 250 timesteps to sample from the baseline DDPM... See Table 2 for explicit details. |