Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference
Authors: Jasper Tan, Blake Mason, Hamid Javadi, Richard Baraniuk
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical and Experimental | In a theoretical direction, we prove for linear regression on Gaussian data in the overparameterized regime that increasing the number of parameters of the model increases its vulnerability to MI (Theorem 3.2). In a supporting empirical direction, we demonstrate that the same behavior holds for a range of more complex models: a latent space model, a time-series model, and a nonlinear random ReLU features model (Section 5). |
| Researcher Affiliation | Academia | Jasper Tan (Rice University), Blake Mason (Rice University), Hamid Javadi (Rice University), Richard G. Baraniuk (Rice University) |
| Pseudocode | No | The paper describes methods using mathematical formulations and prose, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is provided in https://github.com/tanjasper/parameters_or_privacy. |
| Open Datasets | No | The paper describes data generation processes (e.g., 'data points (x_i, y_i) where x_i ~ N(0, I_d) and y_i = x_i^T β + ε_i, where ε_i ~ N(0, σ²)') and defines specific data models (Latent Space Model, Time-Series Data, Random ReLU Features) for its experiments, rather than using or providing access information for existing publicly available datasets. (A sketch of this data-generation process is given below the table.) |
| Dataset Splits | No | The paper describes how training data is sampled ('Sampling n data points, we denote by X the n × d matrix whose i-th row is x_i^T and by y the n-dimensional vector of elements y_i') and how samples are treated for membership inference (the m = 0 vs. m = 1 cases), but it does not specify a separate validation dataset or its split percentages/counts for model tuning or evaluation. |
| Hardware Specification | Yes | Our experiments are implemented in Python 3.9 using PyTorch 1.10.1 and run on an Nvidia Tesla V100 GPU. |
| Software Dependencies | Yes | Our experiments are implemented in Python 3.9 using PyTorch 1.10.1 and run on an Nvidia Tesla V100 GPU. |
| Experiment Setup | Yes | In this experiment, we set n = 200, d = 20, and vary p. For each experiment, we sample a single x_0 ~ N(0, I_d) and a single set of w_j vectors, each from N(0, I_d), and keep them fixed. We leave the other variables random with the following distributions: z_i ~ N(0, I_d), ε_i ~ N(0, σ²), β ~ N(0, (1/d) I_d), and u_{i,j} ~ N(0, 1). (A simplified sketch of this setup is given below the table.) |
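
To make the quoted data model concrete, here is a minimal sketch of the paper's Gaussian linear-regression setting with a minimum-norm interpolating fit and a generic loss-thresholding membership-inference guess. The dimensions, noise level, and threshold are illustrative assumptions, and the thresholding attack is a standard stand-in rather than the specific attack analyzed in the paper; the authors' actual code is in the linked repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) dimensions: d > n puts the model in the
# overparameterized regime that Theorem 3.2 addresses.
n, d = 50, 200
sigma = 0.5  # assumed label-noise standard deviation

# Data model quoted above: x_i ~ N(0, I_d),
# y_i = x_i^T beta + eps_i with eps_i ~ N(0, sigma^2).
beta = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ beta + sigma * rng.normal(size=n)

# Minimum-norm least-squares interpolator: beta_hat = pinv(X) @ y.
beta_hat = np.linalg.pinv(X) @ y

def mi_guess(x, y_val, threshold=1e-6):
    """Guess membership: flag (x, y_val) as a training member if the
    model's squared error on it is near zero. Hypothetical helper,
    not from the authors' repository."""
    return (x @ beta_hat - y_val) ** 2 < threshold

# Training points interpolate exactly, so they are flagged as members;
# a fresh draw from the same distribution typically is not.
print(mi_guess(X[0], y[0]))           # member -> True
x_new = rng.normal(size=d)
y_new = x_new @ beta + sigma * rng.normal()
print(mi_guess(x_new, y_new))         # non-member -> usually False
```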
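
Similarly, here is a simplified sketch of a random ReLU features regression using the quoted constants n = 200 and d = 20, with p as the overparameterization knob. It omits the latent variables z_i and u_{i,j} from the quoted setup and uses a plain Gaussian linear teacher, so it illustrates the role of p rather than reproducing the authors' exact experimental pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Constants quoted above; p (the number of random ReLU features) is
# varied in the paper's experiment. p = 500 here is an assumed value.
n, d, p = 200, 20, 500
sigma = 0.5  # assumed noise level

# Fixed random weights w_j ~ N(0, I_d), sampled once and kept fixed.
W = rng.normal(size=(p, d))

def relu_features(X):
    """Map raw inputs to p random ReLU features: phi(x)_j = max(0, w_j^T x)."""
    return np.maximum(0.0, X @ W.T)

# Gaussian inputs with a linear teacher, following the quoted
# distributions (beta ~ N(0, (1/d) I_d), eps ~ N(0, sigma^2)).
beta = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ beta + sigma * rng.normal(size=n)

# Minimum-norm least squares on the feature matrix; once p > n the
# model interpolates the training data.
Phi = relu_features(X)                # (n, p) feature matrix
theta_hat = np.linalg.pinv(Phi) @ y
train_mse = np.mean((Phi @ theta_hat - y) ** 2)
print(f"p = {p}, training MSE = {train_mse:.2e}")  # ~0 for p > n
```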