Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Easy Differentially Private Linear Regression
Authors: Kareem Amin, Matthew Joseph, Mónica Ribero, Sergei Vassilvitskii
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate all four algorithms on the following datasets. The first dataset is synthetic, and the rest are real. Our main experiments compare the four methods at (ln(3), 10^-5)-DP. A concise summary of the experiment results appears in Figure 1. |
| Researcher Affiliation | Collaboration | Part of this work was done while Mónica was at UT Austin. |
| Pseudocode | Yes | Algorithm 1 PTRCheck. Algorithm 2 Tukey EM. |
| Open Source Code | Yes | All experiment code can be found on Github (Google, 2022). |
| Open Datasets | Yes | 1. Synthetic (d = 11, n = 22,000, Pedregosa et al. (2011)). 2. California (d = 9, n = 20,433, Nugent (2017)) predicting house price. 3. Diamonds (d = 10, n = 53,940, Agarwal (2017)), predicting diamond price. |
| Dataset Splits | No | No specific training/validation/test dataset splits (e.g., percentages, sample counts, or cross-validation setup) are explicitly stated in the paper. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running the experiments were provided. |
| Software Dependencies | No | The paper mentions software like "TensorFlow Privacy and Keras (Chollet et al., 2015)" and "sklearn.make_regression" but does not provide specific version numbers for these software dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | Our experiments tune DPSGD over a large grid consisting of 2,184 joint hyperparameter settings, over learning rate {10^-6, 10^-5, ..., 1}, clip norm {10^-6, 10^-5, ..., 10^6}, microbatches {2^5, 2^6, ..., 2^10}, and epochs {1, 5, 10, 20}. Figure 4: Hyperparameter settings used by DPSGD on each dataset. |
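
The Open Datasets row reports a synthetic dataset with d = 11 and n = 22,000 attributed to Pedregosa et al. (2011) (scikit-learn), and the Software Dependencies row mentions sklearn.make_regression. A minimal sketch of generating a synthetic regression dataset of that size follows; the generator arguments (noise level, random seed, and whether d = 11 counts only feature columns) are assumptions for illustration, not details taken from the paper.

```python
# Sketch only: synthetic regression data at the reported scale using
# scikit-learn's make_regression. Noise and seed are assumed values.
from sklearn.datasets import make_regression

n_samples, n_features = 22_000, 11  # sizes quoted in the Open Datasets row
X, y = make_regression(
    n_samples=n_samples,
    n_features=n_features,
    noise=10.0,       # assumed; the paper does not state generator noise here
    random_state=0,   # assumed seed
)
print(X.shape, y.shape)  # (22000, 11) (22000,)
```

The Experiment Setup row quotes a grid of 2,184 joint DPSGD hyperparameter settings. The sketch below enumerates that grid with itertools.product; the grid values come from the quoted text, while representing it as a Python product over lists is an assumption about how one might reproduce the search, not the authors' code.

```python
# Sketch only: enumerate the DPSGD hyperparameter grid described in the
# Experiment Setup row. 7 learning rates x 13 clip norms x 6 microbatch
# sizes x 4 epoch counts = 2,184 joint settings.
import itertools

learning_rates = [10.0 ** e for e in range(-6, 1)]   # 1e-6 ... 1
clip_norms     = [10.0 ** e for e in range(-6, 7)]   # 1e-6 ... 1e6
microbatches   = [2 ** e for e in range(5, 11)]      # 32 ... 1024
epochs         = [1, 5, 10, 20]

grid = list(itertools.product(learning_rates, clip_norms, microbatches, epochs))
assert len(grid) == 2_184  # matches the count reported in the paper
```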