Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Differentially Private Bagging: Improved utility and cheaper privacy than subsample-and-aggregate
Authors: James Jordon, Jinsung Yoon, Mihaela van der Schaar
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the improvements our model makes over standard subsample-and-aggregate in two datasets (Heart Failure (private) and UCI Adult (public)). |
| Researcher Affiliation | Academia | James Jordon (University of Oxford); Jinsung Yoon (University of California, Los Angeles); Mihaela van der Schaar (University of Cambridge; University of California, Los Angeles; Alan Turing Institute) |
| Pseudocode | Yes | Algorithm 1 Semi-supervised differentially private knowledge transfer using multiple partitions |
| Open Source Code | Yes | Implementation of DPBag can be found at https://bitbucket.org/mvdschaar/mlforhealthlabpub/src/master/alg/dpbag/. |
| Open Datasets | Yes | We demonstrate the improvements our model makes over standard subsample-and-aggregate in two datasets (Heart Failure (private) and UCI Adult (public)). |
| Dataset Splits | Yes | We randomly divide the data into 3 disjoint subsets: (1) a training set (33%), (2) public data (33%), (3) a testing set (33%). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'logistic regression' and 'Gradient Boosting Method (GBM)' as models but does not specify software names with version numbers for implementation or dependencies. |
| Experiment Setup | Yes | We set δ = 10⁻⁵. We vary ϵ ∈ {1, 3, 5}, n ∈ {50, 100, 250} and k ∈ {10, 50, 100}. In all cases we set λ = 2/n. |
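
The split and setup rows above can be made concrete with a short sketch. The following Python is illustrative only and is not taken from the authors' DPBag implementation: the function names `three_way_split` and `experiment_grid` are hypothetical, the split is done on plain index lists rather than on the actual datasets, and λ is computed as 2/n, which is an assumption about how the quoted setup should be read.

```python
import itertools
import random

# Values quoted in the Experiment Setup row; delta is read as 10^-5.
DELTA = 1e-5
EPSILON_VALUES = [1, 3, 5]
N_VALUES = [50, 100, 250]   # n in the paper's notation
K_VALUES = [10, 50, 100]    # k in the paper's notation


def three_way_split(n_samples, seed=0):
    """Shuffle indices and cut them into three disjoint, near-equal parts
    (training, public, testing). Hypothetical helper, not the authors' code."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    third = n_samples // 3
    train = indices[:third]
    public = indices[third:2 * third]
    test = indices[2 * third:]  # takes any remainder
    return train, public, test


def experiment_grid():
    """Enumerate every (epsilon, n, k) combination from the quoted setup."""
    configs = []
    for eps, n, k in itertools.product(EPSILON_VALUES, N_VALUES, K_VALUES):
        configs.append({
            "epsilon": eps,
            "delta": DELTA,
            "n": n,
            "k": k,
            "lambda": 2 / n,  # assumption: lambda = 2/n
        })
    return configs
```

Under these assumptions the grid contains 3 × 3 × 3 = 27 configurations, and the three index sets together cover every sample exactly once.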