Learning Differentially Private Recurrent Language Models
Authors: H. Brendan McMahan, Daniel Ramage, Kunal Talwar, Li Zhang
ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our work demonstrates that given a dataset with a sufficiently large number of users (a requirement easily met by even small internet-scale datasets), achieving differential privacy comes at the cost of increased computation, rather than decreased utility as in most prior work. We find that our private LSTM language models are quantitatively and qualitatively similar to un-noised models when trained on a large dataset. In extensive experiments in Section 3, we offer guidelines for parameter tuning when training complex models with differential privacy guarantees. |
| Researcher Affiliation | Industry | H. Brendan McMahan (mcmahan@google.com), Daniel Ramage (dramage@google.com), Kunal Talwar (kunal@google.com), Li Zhang (liqzhang@google.com) |
| Pseudocode | Yes | The pseudocode for DP-FedAvg and DP-FedSGD is given as Algorithm 1. In the remainder of this section, we introduce estimators for C) and then different clipping strategies for B). Adding the sampling procedure from A) and the noise added in D) allows us to apply the moments accountant to bound the total privacy loss of the algorithm, given in Theorem 1. Finally, we consider the properties of the moments accountant that make training on large datasets particularly attractive. Algorithm 1: The main loop for DP-FedAvg and DP-FedSGD, the only difference being in the user update function (UserUpdateFedAvg or UserUpdateFedSGD). The calls on the moments accountant M refer to the API of Abadi et al. (2016b). A minimal sketch of this main loop is given below the table. |
| Open Source Code | No | The paper mentions using an implementation of the moments accountant from Abadi et al. (2016b) and provides its GitHub link, but it does not state that the code for the methodology described in *this* paper is open-source or publicly available. |
| Open Datasets | Yes | However, to facilitate reproducibility and comparison to non-private models, our experiments are conducted on a public dataset as is standard in differential privacy research. We use a large public dataset of Reddit posts, as described by Al-Rfou et al. (2016). Critically for our purposes, each post in the database is keyed by an author, so we can group the data by these keys in order to provide user-level privacy. We preprocessed the dataset to K = 763,430 users, each with 1600 tokens. Thus, we take w_k = 1 for all users, so W = K. We write C = qK = qW for the expected number of users sampled per round. See Appendix B for details on the dataset and preprocessing. The Reddit dataset can be accessed through Google BigQuery (Reddit Comments Dataset). |
| Dataset Splits | No | The paper mentions training on the Reddit dataset and using a 'relatively small test set'. It does not explicitly mention or specify details for a separate validation set. |
| Hardware Specification | No | The paper does not explicitly mention specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using an 'implementation of the moments accountant' and refers to TensorFlow (via a GitHub link in the references), but it does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For these experiments, we use the FedAvg algorithm with a fixed learning rate of 6.0, which we verified was a reasonable choice in preliminary experiments. In all FedAvg experiments, we used a local batch size of B = 8, an unroll size of 10 tokens, and made E = 1 passes over the local dataset; thus FedAvg processes 80 tokens per batch, processing a user's 1600 tokens in 20 batches per round. The per-round arithmetic is checked in the second sketch below the table. |
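
Below is a minimal Python/NumPy sketch of the Algorithm 1 main loop quoted in the Pseudocode row. It is an illustration of the structure described there, not the authors' implementation: the helper names (`clip_update`, `user_update_fn`), the flat-clipping choice, and the use of flat NumPy parameter vectors are assumptions, and the moments-accountant query is indicated only as a comment.

```python
import numpy as np

def clip_update(delta, clip_norm):
    """Scale a user update so its L2 norm is at most clip_norm (flat clipping)."""
    norm = np.linalg.norm(delta)
    return delta * min(1.0, clip_norm / norm) if norm > 0 else delta

def dp_fedavg(theta0, users, user_update_fn, q, clip_norm, noise_multiplier,
              num_rounds, seed=0):
    """Noised federated-averaging loop with equal user weights (w_k = 1, W = K)."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    W = len(users)                       # total user weight, here W = K
    expected_users = q * W               # expected users sampled per round, C = qW
    # Noise scale for the fixed-denominator estimator: sigma = z * S / (qW)
    sigma = noise_multiplier * clip_norm / expected_users
    for t in range(num_rounds):
        # A) sample each user independently with probability q
        sampled = [u for u in users if rng.random() < q]
        if not sampled:
            continue
        # B) compute and clip each selected user's update
        deltas = [clip_update(user_update_fn(theta, u), clip_norm) for u in sampled]
        # C) bounded-sensitivity estimator: divide by the *expected* number
        #    of sampled users, qW, rather than the realized count
        avg_delta = np.sum(deltas, axis=0) / expected_users
        # D) add Gaussian noise calibrated to the estimator's sensitivity
        noise = rng.normal(0.0, sigma, size=theta.shape)
        theta = theta + avg_delta + noise
        # A moments accountant (Abadi et al., 2016b) would be queried here with
        # (q, noise_multiplier, t + 1) to track the cumulative (eps, delta).
    return theta
```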
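
The Open Datasets and Experiment Setup rows fix the per-round data accounting. The short check below reproduces that arithmetic; the sampling probability `q` is a purely hypothetical value chosen so that the expected number of sampled users per round is 100.

```python
# Per-round accounting implied by the quoted setup; q below is hypothetical.
K = 763_430              # users after preprocessing; w_k = 1, so W = K
tokens_per_user = 1600
B, unroll, E = 8, 10, 1  # local batch size, unroll length, local passes

tokens_per_batch = B * unroll                                  # 8 * 10 = 80
batches_per_round = E * tokens_per_user // tokens_per_batch    # 1600 / 80 = 20

q = 100 / K                        # hypothetical per-round sampling probability
expected_users_per_round = q * K   # C = qK = qW = 100

print(tokens_per_batch, batches_per_round, expected_users_per_round)
```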