Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Personalized Privacy Amplification via Importance Sampling

Authors: Dominik Fay, Sebastian Mair, Jens Sjölund

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate both approaches empirically in terms of privacy, efficiency, and accuracy on the differentially private k-means problem. We observe that both approaches yield similar outcomes and consistently outperform uniform sampling across a wide range of data sets. Our code is available on Git Hub.1
Researcher Affiliation	Collaboration	Dominik Fay EMAIL Elekta and KTH Royal Institute of Technology, Stockholm, Sweden Sebastian Mair EMAIL Linköping University and Uppsala University, Sweden Jens Sjölund EMAIL Uppsala University, Sweden
Pseudocode	Yes	Algorithm 1 Optimization for privacy-constrained importance weights
Open Source Code	Yes	Our code is available on Git Hub.1
Open Datasets	Yes	We use the following eight real-world data sets: Covertype (Blackard & Dean, 1999) (n = 581,012, d = 54), FMA2 (Defferrard et al., 2017) (n = 106,574, d = 518), Ijcnn1 3 (Chang & Lin, 2001) (n = 49,990, d = 22), KDD-Protein4 (n = 145,751, d = 74), Mini Boo NE (Dua & Graff, 2017) (n = 130,064, d = 50), Pose5 (Catalin Ionescu, 2011; Ionescu et al., 2014) (n = 35,832, d = 48), RNA (Uzilov et al., 2006) (n = 488,565, d = 8), and Song (Bertin-Mahieux et al., 2011) (n = 515,345, d = 90).
Dataset Splits	No	The paper does not provide explicit training/test/validation dataset splits, percentages, or specific file names for custom splits. It mentions preprocessing steps like setting an ℓ2 cut-off point and centering the data, but no partition details for reproducibility of data splits.
Hardware Specification	Yes	All experiments run on a dual AMD Epyc machine with 2 × 64 cores with 2.25 GHz and 2 Ti B of memory.
Software Dependencies	No	The code6 is implemented in Python using numpy (Harris et al., 2020). Algorithm 1 is implemented in Rust. The paper mentions software but does not specify version numbers for Python, Rust, or numpy, which are crucial for reproducibility.
Experiment Setup	Yes	We fix the number of iterations of DP-Lloyd to T = 10 and the number of clusters to k = 25. The scale parameters are set to βsum = q d 2ρ and βcount = 3p 4dρ2βsum, where ρ = 0.225 as suggested by Su et al. (2016). Here, B is a constant controlling the noise scales that we select to achieve a specific target epsilon ϵ ∈ {0.5, 1, 3, 10, 30, 100, 300, 1000.0} for a given (expected) subsample size m and vice versa. ... For the coreset-based (core) sampling, the sampling distribution is qcore(x) = λ m n + (1 − λ) m x 2 2 n x , where we set λ = 1/2. ... For the privacy-constrained (opt) sampling, we compute qpriv(x) numerically via Algorithm 1 for the target ϵ = (r/βsum + 1/βcount)T using the strong convexity constant from Equation (4). ... Furthermore, we initialize the cluster centers as k random data points.