Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Ranking Recovery under Privacy Considerations

Authors: Minoh Jeong, Alex Dytso, Martina Cardone

TMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We derive the Taylor series of the probability of error, which yields its first- and second-order approximations when such a linear decoder is employed. We quantify the guaranteed level of privacy using differential privacy (DP) types of metrics... Finally, we put together the results to characterize trade-offs between privacy and probability of error. ...we verify through numerical simulations that this approximation is indeed accurate. In particular, our first-order approximation expression decouples the effects of the input data distribution and noise distribution on the error probability. ...Figure 2 (see more figures in Appendix E.1) shows that this approximation is indeed accurate when N ∈ R^n is i.i.d. according to three different distributions, namely Gaussian (red curve), Laplace (blue curve), and generalized normal with p = 0.5 (green curve)."
Researcher Affiliation | Academia | Minoh Jeong, Department of Electrical and Computer Engineering, University of Minnesota; Alex Dytso, Department of Electrical and Computer Engineering, New Jersey Institute of Technology; Martina Cardone, Department of Electrical and Computer Engineering, University of Minnesota
Pseudocode | No | The paper contains mathematical derivations, proofs, and theoretical analyses. It does not include any explicitly labeled pseudocode blocks or algorithms formatted like code.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository for the methodology described.
Open Datasets | No | The paper uses simulated data for its numerical evaluations rather than pre-existing publicly available datasets. For example, it states: "the components of X were chosen to be i.i.d. according to Unif(0, 100)" and "For the simulations illustrated in Figure 3, we set Unif(0, 1) and Exp(1) for X_i, i ∈ [1:n], and N ∼ N(0_n, σ²I_n)." No specific links, DOIs, or formal citations to external datasets are provided.
Dataset Splits | No | The paper uses Monte-Carlo simulations with generated data. It describes the characteristics of the generated data (e.g., i.i.d. Unif(0, 100), n = 20) and the number of iterations (10^6), but does not involve explicit training/test/validation splits from a fixed dataset.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory) used to run the numerical simulations.
Software Dependencies | No | The paper does not mention any specific software dependencies or version numbers (e.g., programming languages, libraries, frameworks) used for the simulations.
Experiment Setup | Yes | "In Figure 2, the components of X were chosen to be i.i.d. according to Unif(0, 100) with n = 20. The solid curves (probability of error) were obtained by Monte-Carlo simulation with 10^6 iterations, while the dashed curves (first-order approximation of the error probability) were obtained by simply evaluating (19). For the simulations illustrated in Figure 3, we set Unif(0, 1) and Exp(1) for X_i, i ∈ [1:n], and N ∼ N(0_n, σ²I_n). The curves for the true error probability P_e(φ_lin) were obtained by Monte-Carlo simulation using 10^6 iterations, whereas we obtained the curves for the first- and second-order approximations by evaluating the expression in Corollary 3.7. The data dimension is set to n = 10 for (a) and (b), and to n = 20 for (c) and (d)."
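The Monte-Carlo setup quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: it assumes the ranking task is recovering the order of X from noisy observations Y = X + N by sorting, which stands in for the paper's linear decoder; the function name `ranking_error_prob` and the reduced iteration count are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def ranking_error_prob(n=20, sigma=1.0, iters=10_000):
    """Estimate P(ranking of Y differs from ranking of X) by Monte-Carlo.

    Assumed model: X_i i.i.d. Unif(0, 100) and N ~ N(0_n, sigma^2 I_n),
    as in the Figure 2 setup quoted above (with fewer iterations).
    """
    x = rng.uniform(0.0, 100.0, size=(iters, n))          # data X
    y = x + rng.normal(0.0, sigma, size=(iters, n))       # noisy observation Y = X + N
    # A trial is an error if sorting Y gives a different permutation than sorting X.
    wrong = np.any(np.argsort(y, axis=1) != np.argsort(x, axis=1), axis=1)
    return wrong.mean()

# Error probability grows with the noise level sigma.
p_low = ranking_error_prob(sigma=0.01)
p_high = ranking_error_prob(sigma=10.0)
```

Sweeping `sigma` reproduces the qualitative behavior reported in the paper's figures: the error probability is near zero when the noise is small relative to the gaps between the X_i and approaches one as the noise dominates.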