reproducibilityindex.ai

Training Unbiased Diffusion Models From Biased Dataset

Authors: Yeongmin Kim, Byeonghu Na, Minsang Park, JoonHo Jang, Dongjun Kim, Wanmo Kang, Il-chul Moon

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The experimental evidence supports the usefulness of the proposed method, which outperforms baselines including time-independent importance reweighting on CIFAR-10, CIFAR-100, FFHQ, and CelebA with various bias settings.
Researcher Affiliation	Collaboration	Yeongmin Kim1, Byeonghu Na1, Minsang Park1, JoonHo Jang1, Dongjun Kim1, Wanmo Kang1, Il-Chul Moon1,2 (...) 1KAIST, 2Summary.AI
Pseudocode	Yes	Algorithm 1: Discriminator Training algorithm (...) Algorithm 2: Score Training algorithm with TIW-DSM
Open Source Code	Yes	Our code is available at https://github.com/alsdudrla10/TIW-DSM.
Open Datasets	Yes	We consider CIFAR-10, CIFAR-100, FFHQ, and CelebA datasets, which are commonly used for generative learning.
Dataset Splits	No	The paper does not state the percentages or sample counts of the training/validation/test splits drawn from the observed dataset (D_bias), so the experiments' data partitioning cannot be reproduced exactly.
Hardware Specification	Yes	Table 7 shows the computational costs measured using RTX 4090 × 4 cores in the CIFAR-10 experiments.
Software Dependencies	No	The paper mentions using PyTorch and following procedures from EDM, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	Table 6: Training and sampling configurations.