Classification Diffusion Models: Revitalizing Density Ratio Estimation

Authors: Shahar Yadin, Noam Elata, Tomer Michaeli

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our method is the first DRE-based technique that can successfully generate images beyond the MNIST dataset. Furthermore, it can output the likelihood of any input in a single forward pass, achieving state-of-the-art negative log likelihood (NLL) among methods with this property... Our experiments shed light on the reasons why DRE methods have failed on complex high-dimensional data to date, and why CDM inherently avoids these challenges."
Researcher Affiliation | Academia | "Shahar Yadin, Noam Elata, Tomer Michaeli. Faculty of Electrical and Computer Engineering, Technion - Israel Institute of Technology. {shahar.yadin@campus,noamelata@campus,tomer.m@ee}.technion.ac.il"
Pseudocode | Yes | "Algorithm 1 CDM Training", "Algorithm 2 DDPM Sampling Using CDM" (a hedged sampling sketch follows the table)
Open Source Code | Yes | "Code is available on the project's webpage."
Open Datasets | Yes | "We train several CDMs on two common datasets. For CIFAR-10 [26] we train both a class-conditional model and an unconditional model. We also train a similar model for CelebA [31], using face images of size 64×64." (a hedged data-loading sketch follows the table)
Dataset Splits | No | The paper mentions using common datasets like CIFAR-10 and CelebA and evaluating on the test set, but it does not provide specific details on the train/validation/test splits (e.g., percentages, sample counts, or explicit splitting methodology).
Hardware Specification | Yes | "Training the model on CelebA 64×64 takes 108 hours on a server of 4 NVIDIA RTX A6000 48GB GPUs... Training the model on CIFAR-10 takes 35 hours on a server of 4 NVIDIA RTX A6000 48GB GPUs."
Software Dependencies | No | The paper mentions using a 'PyTorch diffusion repository' but does not specify the version of PyTorch or any other software dependencies with their version numbers.
Experiment Setup | Yes | "We trained the model for 500k iterations with a learning rate of 1×10^-4. We started with a linear warmup of 5k iterations and reduced the learning rate by a factor of 10 after every 200k iterations. The typical value of the CE loss after convergence was 3.8 while the MSE loss was 0.0134, so we chose to give the CE loss a weight of 0.001 to ensure the values of both losses have the same order of magnitude. In addition, we used EMA with a factor of 0.9999, as done in the baseline model." (a hedged training-configuration sketch follows the table)
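
The Pseudocode row above cites "Algorithm 2 DDPM Sampling Using CDM". Below is a minimal sketch of standard DDPM ancestral sampling in PyTorch, assuming a hypothetical `cdm_denoiser(x_t, t)` callable that returns the noise prediction which, per the paper, is derived from the CDM classifier's outputs (that derivation is not reproduced here); the linear beta schedule and variable names are assumptions, not details taken from the paper.

```python
import torch

@torch.no_grad()
def ddpm_sample(cdm_denoiser, shape, T=1000, device="cuda"):
    """Standard DDPM ancestral sampling.

    `cdm_denoiser(x_t, t)` is assumed to return the predicted noise eps
    (in the paper this prediction comes from the CDM classifier; here it
    is treated as a black box).
    """
    # Linear beta schedule (an assumption; the paper follows its DDPM baseline).
    betas = torch.linspace(1e-4, 0.02, T, device=device)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape, device=device)  # x_T ~ N(0, I)
    for t in reversed(range(T)):
        eps = cdm_denoiser(x, torch.full((shape[0],), t, device=device))
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = mean + torch.sqrt(betas[t]) * torch.randn_like(x)  # add noise except at the last step
        else:
            x = mean
    return x
```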
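
The Open Datasets row quotes training on CIFAR-10 and on CelebA at 64×64. A minimal torchvision loading sketch follows; the CelebA center-crop-then-resize preprocessing and the horizontal-flip augmentation are assumed common choices, not details reported in the paper.

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# CIFAR-10: used at its native 32x32 resolution.
cifar_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),  # assumed augmentation, common for diffusion baselines
    transforms.ToTensor(),
])
cifar_train = datasets.CIFAR10("data/", train=True, transform=cifar_tf, download=True)

# CelebA: cropped and resized to 64x64 (the crop size of 140 is an assumed, common choice).
celeba_tf = transforms.Compose([
    transforms.CenterCrop(140),
    transforms.Resize(64),
    transforms.ToTensor(),
])
celeba_train = datasets.CelebA("data/", split="train", transform=celeba_tf, download=True)

train_loader = DataLoader(cifar_train, batch_size=128, shuffle=True, num_workers=4)
```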
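
The Experiment Setup row reports 500k iterations, a learning rate of 1×10^-4 with a 5k-step linear warmup and a tenfold reduction every 200k steps, a CE-loss weight of 0.001 alongside the MSE loss, and EMA with factor 0.9999. The sketch below mirrors only those reported values; the Adam optimizer, the `model(x)` interface, and the exact EMA update form are assumptions.

```python
import copy
import torch

def lr_lambda(step, warmup=5_000, decay_every=200_000):
    """Linear warmup for 5k steps, then divide the LR by 10 every 200k steps."""
    if step < warmup:
        return step / warmup
    return 0.1 ** (step // decay_every)

def train(model, loader, total_steps=500_000, ce_weight=1e-3, ema_decay=0.9999, device="cuda"):
    model.to(device)
    ema_model = copy.deepcopy(model)                          # EMA copy of the weights
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)       # optimizer choice is an assumption
    sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)

    step, data = 0, iter(loader)
    while step < total_steps:
        try:
            x, _ = next(data)
        except StopIteration:
            data = iter(loader)
            x, _ = next(data)
        x = x.to(device)

        # Hypothetical model interface: returns the noise prediction (MSE term)
        # and the timestep-classification logits (CE term).
        eps_pred, logits, eps_true, t = model(x)
        mse = torch.nn.functional.mse_loss(eps_pred, eps_true)
        ce = torch.nn.functional.cross_entropy(logits, t)
        loss = mse + ce_weight * ce                           # CE weighted by 0.001, as reported

        opt.zero_grad()
        loss.backward()
        opt.step()
        sched.step()

        # EMA update with factor 0.9999.
        with torch.no_grad():
            for p_ema, p in zip(ema_model.parameters(), model.parameters()):
                p_ema.mul_(ema_decay).add_(p, alpha=1 - ema_decay)

        step += 1
    return ema_model
```

With the reported values, the weighted CE term (0.001 × 3.8 ≈ 0.004) is on the same order as the reported MSE term (0.0134), consistent with the stated rationale for the 0.001 weight.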