Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

DRAG: Data Reconstruction Attack using Guided Diffusion

Authors: Wa-Kin Lei, Jun-Cheng Chen, Shang-Tse Chen

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods, both qualitatively and quantitatively, in reconstructing data from deep-layer IRs of the vision foundation model. The results highlight the urgent need for more robust privacy protection mechanisms for large models in SI scenarios.
Researcher Affiliation	Academia	(1) National Taiwan University; (2) Research Center for Information Technology Innovation, Academia Sinica.
Pseudocode	Yes	Algorithm 1 DRAG. // Noise ϵ is sampled from N(0, I) for every usage. s ← {.m = 0, .v = 0, .i = 0}; z_T ∼ N(0, I); for t = T to 1 do; for n = 1 to k do; x̂_0 ← D(TWEEDIESESTIMATION(z_t)); g_t ← ∇_{z_t}(d_H(f_c(x̂_0), h) + λ_ℓ2 R_ℓ2(x̂_0)); g_t ← CLIPNORM(g_t, c_max); g_t, s ← STATEUPDATE(g_t, s); z_{t−1} ← GUIDEDSAMPLING(z_t, g_t); z_t ← √(α_t/α_{t−1}) z_{t−1} + √(1 − α_t/α_{t−1}) ϵ; end for; end for; return D(z_0). // Refine g_t via momentum such as Adam. function STATEUPDATE(g_t, s): s.m ← β_1 s.m + (1 − β_1) g_t; s.v ← β_2 s.v + (1 − β_2) g_t²; s.i ← s.i + 1; m̂, v̂ ← s.m/(1 − β_1^i), s.v/(1 − β_2^i); return g_t, s; end function. function GUIDEDSAMPLING(z_t, g_t): ϵ_t ← r UNIT((1 − w) σ_t ϵ + w r UNIT(g_t)); return DDIM(z_t, ϵ_θ(z_t), ϵ_t); end function.
Open Source Code	Yes	Code is available at: https://github.com/ntuaislab/DRAG
Open Datasets	Yes	To evaluate our proposed methods, we sample 10 images from the official validation splits of each dataset: (1) MSCOCO (Lin et al., 2014), (2) FFHQ (Karras et al., 2019), and (3) ImageNet-1K (Deng et al., 2009), constructing a collection of diverse natural images.
Dataset Splits	Yes	To evaluate our proposed methods, we sample 10 images from the official validation splits of each dataset: (1) MSCOCO (Lin et al., 2014), (2) FFHQ (Karras et al., 2019), and (3) ImageNet-1K (Deng et al., 2009), constructing a collection of diverse natural images. All images are center-cropped and resized to 224 × 224. We use ImageNet-1K image classification as the primary task to quantitatively assess model utility. To simulate realistic conditions where the client and adversary have non-overlapping datasets, we randomly split the official training split of ImageNet-1K into two distinct, equal-sized and non-overlapping subsets: a private portion D_private and a public portion D_public.
Hardware Specification	Yes	The experiments were conducted on a server equipped with 384 GB RAM, two Intel Xeon Gold 6226R CPUs, and eight NVIDIA RTX A6000 GPUs.
Software Dependencies	No	The implementation of rMLE (He et al., 2019), LM (Singh et al., 2021), DISCO (Singh et al., 2021) and NoPeek (Vepakomma et al., 2020) are adapted from prior works (footnote 1: https://github.com/aidecentralized/InferenceBenchmark).
Experiment Setup	Yes	We list the hyperparameters for various optimization-based and learning-based reconstruction attacks in Table 9 and Table 10, respectively.
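
The helper steps named in the Pseudocode row (CLIPNORM, STATEUPDATE, and the re-noising line that maps z_{t−1} back to z_t) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The names `clip_norm`, `state_update`, `renoise`, and `State` are hypothetical, the default values of `beta1`, `beta2`, and `eps` are the common Adam defaults rather than values taken from the paper, and the final Adam-style direction m̂ / (√v̂ + eps) is an assumption, since the quoted pseudocode does not show how m̂ and v̂ are combined before `g_t` is returned.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class State:
    """Running moments, mirroring s = {.m, .v, .i} in the quoted pseudocode."""
    m: list
    v: list
    i: int = 0


def clip_norm(g, c_max):
    """Rescale g so that its L2 norm does not exceed c_max."""
    norm = math.sqrt(sum(x * x for x in g))
    if norm <= c_max or norm == 0.0:
        return list(g)
    scale = c_max / norm
    return [x * scale for x in g]


def state_update(g, s, beta1=0.9, beta2=0.999, eps=1e-8):
    """Update first and second moments and return a refined gradient.

    beta1, beta2, and eps are assumed defaults, not values from the paper.
    """
    s.m = [beta1 * m + (1 - beta1) * x for m, x in zip(s.m, g)]
    s.v = [beta2 * v + (1 - beta2) * x * x for v, x in zip(s.v, g)]
    s.i += 1
    # Bias correction, as in the quoted m-hat and v-hat line.
    m_hat = [m / (1 - beta1 ** s.i) for m in s.m]
    v_hat = [v / (1 - beta2 ** s.i) for v in s.v]
    # ASSUMPTION: standard Adam direction; the quoted pseudocode omits this step.
    g_new = [m / (math.sqrt(v) + eps) for m, v in zip(m_hat, v_hat)]
    return g_new, s


def renoise(z_prev, alpha_t, alpha_prev, rng):
    """Map z_{t-1} back to z_t: sqrt(a) * z_{t-1} + sqrt(1 - a) * noise,
    where a = alpha_t / alpha_prev and fresh Gaussian noise is drawn per element."""
    ratio = alpha_t / alpha_prev
    return [
        math.sqrt(ratio) * z + math.sqrt(1 - ratio) * rng.gauss(0.0, 1.0)
        for z in z_prev
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    state = State(m=[0.0, 0.0], v=[0.0, 0.0])
    g = clip_norm([3.0, 4.0], c_max=1.0)
    g, state = state_update(g, state)
    z_t = renoise([0.5, -0.5], alpha_t=0.8, alpha_prev=0.9, rng=rng)
    print(g, state.i, z_t)
```

Repeating the guidance step k times per timestep, as the inner loop of the pseudocode does, relies on this re-noising to return the latent to noise level t before the next gradient is computed.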