Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Differentially Private Empirical Risk Minimization under the Fairness Lens
Authors: Cuong Tran, My Dinh, Ferdinando Fioretto
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed approach is evaluated on several datasets and settings. |
| Researcher Affiliation | Academia | Cuong Tran (Syracuse University), My H. Dinh (Syracuse University), Ferdinando Fioretto (Syracuse University) |
| Pseudocode | Yes | Algorithm 1: DP-SGD input: Disjoint dataset D; Sample prob. q; Iterations T; Noise variance σ²; Clipping bound C; learning rate η |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code for the methodology described, nor does it include a direct link to a code repository. |
| Open Datasets | Yes | The proposed approach is evaluated on several datasets and settings. ... on two datasets. Each data point represents the average of 100 runs of a DP Logistic Regression (obtained with output perturbation) on each group z ∈ A. Details on dataset and experimental setting are provided in Appendix B and additional experiments in Appendix C. ... Healthcare dataset stroke data. URL http://www.kaggle.com/fedesoriano/stroke-prediction-dataset. ... UCI repository of machine learning databases, 1988. URL https://archive.ics.uci.edu/ml/datasets.php. ... Telco customer churn dataset, 2015. URL http://www.ibm.com/communities/analytics/watson-analyticsblog/predictive-insights-in-the-telco-customer-churn-data-set/. |
| Dataset Splits | No | The paper mentions 'Details on dataset and experimental setting are provided in Appendix B and additional experiments in Appendix C.' but does not explicitly provide specific training/test/validation dataset splits in the main text. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4). |
| Experiment Setup | Yes | Algorithm 1: DP-SGD input: Disjoint dataset D; Sample prob. q; Iterations T; Noise variance σ²; Clipping bound C; learning rate η ... The experiments use C = 0.1 and σ = 1. ... The implementation uses a neural network with a single hidden layer and Suppose uses DP-SGD with C = 0.1, σ = 5.0. |
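
For readers who want to see how the inputs listed for Algorithm 1 fit together, here is a minimal NumPy sketch of DP-SGD on a logistic regression model. It is not the authors' code (the table above notes that none was released). It follows the common formulation of DP-SGD: Poisson subsampling with probability q, per-example gradient clipping to norm C, and Gaussian noise with standard deviation σ·C added to the summed gradients. Whether the paper scales the noise in exactly this way is an assumption. The function names `clip_per_example` and `dp_sgd_logreg` are hypothetical, and the default values of `q`, `T`, and `eta` are placeholders; only C = 0.1 and σ = 1 come from the excerpts above.

```python
import numpy as np


def clip_per_example(grads, C):
    """Rescale each row of `grads` so that its L2 norm is at most C."""
    norms = np.linalg.norm(grads, axis=1)
    # Rows already within the bound keep a scale factor of 1.
    scale = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    return grads * scale[:, None]


def dp_sgd_logreg(X, y, q=0.1, T=100, sigma=1.0, C=0.1, eta=0.5, seed=0):
    """Sketch of DP-SGD for logistic regression (illustrative only).

    X: (n, d) feature matrix; y: (n,) labels in {0, 1}.
    q, T, eta are placeholder values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(T):
        # Poisson subsampling: each example joins the batch with probability q.
        mask = rng.random(n) < q
        Xb, yb = X[mask], y[mask]
        # Per-example gradients of the logistic loss.
        p = 1.0 / (1.0 + np.exp(-(Xb @ theta)))
        grads = (p - yb)[:, None] * Xb
        # Clip each example's gradient, then sum and add Gaussian noise.
        clipped = clip_per_example(grads, C)
        noise = rng.normal(0.0, sigma * C, size=d)
        noisy_sum = clipped.sum(axis=0) + noise
        # Normalize by the expected batch size q * n and take a step.
        theta -= eta * noisy_sum / max(q * n, 1.0)
    return theta
```

The sketch does not track the privacy budget; in practice a privacy accountant would be needed to translate q, T, and σ into an (ε, δ) guarantee.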