Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Training Unbiased Diffusion Models From Biased Dataset

Authors: Yeongmin Kim, Byeonghu Na, Minsang Park, JoonHo Jang, Dongjun Kim, Wanmo Kang, Il-chul Moon

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental evidence supports the usefulness of the proposed method, which outperforms baselines including time-independent importance reweighting on CIFAR-10, CIFAR-100, FFHQ, and CelebA with various bias settings.
Researcher Affiliation | Collaboration | Yeongmin Kim1, Byeonghu Na1, Minsang Park1, JoonHo Jang1, Dongjun Kim1, Wanmo Kang1, Il-Chul Moon1,2 (...) 1KAIST, 2Summary.AI
Pseudocode | Yes | Algorithm 1: Discriminator Training algorithm (...) Algorithm 2: Score Training algorithm with TIW-DSM
Open Source Code | Yes | Our code is available at https://github.com/alsdudrla10/TIW-DSM.
Open Datasets | Yes | We consider CIFAR-10, CIFAR-100, FFHQ, and CelebA datasets, which are commonly used for generative learning.
Dataset Splits | No | The paper does not explicitly provide the specific percentages or counts for training/validation/test dataset splits from the observed dataset (Dbias) to reproduce the experiment's data partitioning.
Hardware Specification | Yes | Table 7 shows the computational costs measured using RTX 4090 4 cores in the CIFAR-10 experiments.
Software Dependencies | No | The paper mentions using PyTorch and following procedures from EDM, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | Table 6: Training and sampling configurations.