Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Fast Optimal Locally Private Mean Estimation via Random Projections
Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy Nguyen, Kunal Talwar
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude the paper with several experiments that demonstrate the performance of our proposed algorithms, comparing them to existing algorithms in the literature. We conduct our experiments in two settings: the first uses synthetic data, where we test our algorithms and assess their performance on the basic task of private mean estimation against other algorithms. In the second setting, we evaluate our algorithms for private federated learning, which requires private mean estimation as a subroutine for DP-SGD. |
| Researcher Affiliation | Collaboration | Hilal Asi (Apple Inc.), Vitaly Feldman (Apple Inc.), Jelani Nelson (UC Berkeley), Huy L. Nguyen (Northeastern University), Kunal Talwar (Apple Inc.) |
| Pseudocode | Yes | Algorithm 1 ProjUnit (client), Algorithm 2 ProjUnit (server), Algorithm 3 Correlated ProjUnit (client), Algorithm 4 Correlated ProjUnit (server) |
| Open Source Code | Yes | The code is also available online on https://github.com/apple/ml-projunit |
| Open Datasets | Yes | Similarly to the experimental setup in [13], we consider the MNIST [31] dataset and train a convolutional network (see Table 2) using DP-SGD [1] with 10 epochs and different sub-routines for privately estimating the mean of gradients at each batch. |
| Dataset Splits | No | The paper mentions using the MNIST dataset and training a convolutional network, but it does not provide specific details about train/validation/test splits, percentages, or sample counts. |
| Hardware Specification | No | The paper discusses runtime performance but does not specify the exact hardware components (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or other libraries). |
| Experiment Setup | Yes | In order to bound the sensitivity, we clip the gradients to have ℓ2-norm 1, and run DP-SGD with batch size of 600, step-size equal to 0.1, and momentum of 0.5. |
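The client/server structure named in the pseudocode row (ProjUnit client and server) can be sketched as follows. This is a minimal illustration of the random-projection idea from the paper's title, not the authors' implementation: the client projects its unit vector to a lower dimension with a projection generated from shared randomness, and the server maps the report back with the transpose. The paper additionally privatizes the projected vector with a local randomizer before sending; that step is omitted here, and all function names and the Gaussian projection choice are assumptions for illustration.

```python
import numpy as np

def projunit_client(x, seed, k):
    """Client side (sketch): project the unit vector x in R^d down to
    R^k using a random projection derived from shared randomness
    (the seed), then renormalize. In the actual ProjUnit scheme the
    projected vector would next be privatized (e.g., with PrivUnit)
    before being sent; that privatization step is omitted here."""
    d = x.shape[0]
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((k, d)) / np.sqrt(k)  # shared random projection
    y = W @ x
    return y / np.linalg.norm(y)

def projunit_server(y, seed, k, d):
    """Server side (sketch): regenerate the same projection from the
    shared seed and lift the k-dimensional report back to R^d with
    its transpose."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((k, d)) / np.sqrt(k)
    return W.T @ y
```

Because client and server derive the same matrix from the shared seed, the lifted estimate is positively correlated with the original vector, which is what makes averaging many such reports a mean estimator.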
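The experiment-setup row can be made concrete with a short sketch of the reported training configuration: per-example gradients clipped to ℓ2-norm 1, batch size 600, step size 0.1, momentum 0.5. This is a hedged, noise-free illustration of those hyperparameters only, with hypothetical function names; the paper's actual pipeline privatizes the averaged gradient via its mean-estimation subroutines, which is omitted here.

```python
import numpy as np

def clip_gradient(g, max_norm=1.0):
    """Clip a per-example gradient to l2-norm at most max_norm
    (the paper clips to norm 1 to bound sensitivity)."""
    norm = np.linalg.norm(g)
    if norm > max_norm:
        g = g * (max_norm / norm)
    return g

def dp_sgd_step(params, per_example_grads, velocity,
                step_size=0.1, momentum=0.5, clip_norm=1.0):
    """One DP-SGD-style update with the paper's reported settings:
    clip each gradient, average over the batch (size 600 in the
    paper), then apply momentum SGD. A real DP run would replace the
    plain average with a private mean estimate of the clipped
    gradients; that privatization is omitted in this sketch."""
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    mean_grad = np.mean(clipped, axis=0)
    velocity = momentum * velocity + mean_grad
    params = params - step_size * velocity
    return params, velocity
```

Clipping before averaging is what bounds each example's contribution, so the noise (or private mean estimator) added afterward yields a formal privacy guarantee.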