Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Inherent Privacy Properties of Discrete Denoising Diffusion Models
Authors: Rongzhe Wei, Eleonora Kreacic, Haoyu Peter Wang, Haoteng Yin, Eli Chien, Vamsi K. Potluru, Pan Li
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we empirically verify our theoretical findings on both synthetic and real-world datasets. |
| Researcher Affiliation | Collaboration | Rongzhe Wei EMAIL ECE, Georgia Institute of Technology Eleonora Kreačić EMAIL J.P. Morgan AI Research Haoyu Wang EMAIL ECE, Georgia Institute of Technology Haoteng Yin EMAIL CS, Purdue University Eli Chien EMAIL ECE, Georgia Institute of Technology Vamsi K. Potluru EMAIL J.P. Morgan AI Research Pan Li EMAIL ECE, Georgia Institute of Technology |
| Pseudocode | Yes | Algorithm 1: Privacy Bound for Discrete Diffusion Models; Algorithm 2: Alg_{c_t}: Finding c_t |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We evaluate our algorithm on three benchmark datasets: Adult (Kohavi et al., 1996), German Credit (Hofmann, 1994), and Loan (Itssuru) |
| Dataset Splits | Yes | To evaluate the efficacy of DDMs, we divided the original dataset into three segments: training, validation, and testing, adhering to an 8:1:1 ratio. |
| Hardware Specification | Yes | Experiments were performed on a server with four Intel 24-Core Gold 6248R CPUs, 1TB DRAM, and eight NVIDIA QUADRO RTX 6000 (24GB) GPUs. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and 'leaky ReLU' but does not provide specific version numbers for any libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | We designed a four-layer MLP equipped with 256 hidden neurons. ... The denoising network underwent training via the Adam optimizer, set with a learning rate of 1e-3 and a weight decay of 5e-4. Our approach involved uniformly sampling the diffusion step and drawing batches of 30 samples each. The training spanned 100 epochs, focusing on minimizing the binary cross-entropy loss. |
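The Experiment Setup row can be sketched in code. This is a minimal, hedged reconstruction in PyTorch based only on the details quoted above (four-layer MLP, 256 hidden units, leaky ReLU, Adam with learning rate 1e-3 and weight decay 5e-4, batches of 30, binary cross-entropy loss); the input/output dimensions, the exact layer arrangement, and the use of PyTorch itself are assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn

def build_denoiser(in_dim: int, out_dim: int, hidden: int = 256) -> nn.Sequential:
    """Four-layer MLP with 256 hidden units and leaky ReLU activations,
    as described in the Experiment Setup row (layer layout assumed)."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.LeakyReLU(),
        nn.Linear(hidden, hidden), nn.LeakyReLU(),
        nn.Linear(hidden, hidden), nn.LeakyReLU(),
        nn.Linear(hidden, out_dim),  # logits; paired with BCEWithLogitsLoss below
    )

# Hypothetical dimensions: 32 binary features in and out.
model = build_denoiser(32, 32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4)
loss_fn = nn.BCEWithLogitsLoss()  # binary cross-entropy on logits

# One training step on a synthetic batch of 30 samples (the quoted batch size).
x = torch.rand(30, 32)
target = torch.randint(0, 2, (30, 32)).float()
optimizer.zero_grad()
loss = loss_fn(model(x), target)
loss.backward()
optimizer.step()
```

In the paper's setting this step would be repeated over uniformly sampled diffusion steps for 100 epochs; that loop is omitted here for brevity.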