Adaptive Data Analysis in a Balanced Adversarial Model

Authors: Kobbi Nissim, Uri Stemmer, Eliad Tsfadia

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Theoretical Our first result is a construction of a balanced adversary forcing any computationally bounded mechanism to fail in Game 1.3. Theorem 1.5 (Computational lower bound, informal). There exists an efficient balanced adversary A = (A1, A2) that fails any computationally bounded mechanism M using Θ(n²) adaptive queries. Moreover, it does so by choosing a distribution over a small domain. Our construction in Theorem 1.5 uses the structure of previous attacks of Hardt and Ullman [2014] and Steinke and Ullman [2015a], but relies on a stronger hardness assumption of public-key cryptography. We prove that this is unavoidable. Theorem 1.6 (The necessity of public-key cryptography for proving Theorem 1.5, informal).
Researcher Affiliation Collaboration Kobbi Nissim, Department of Computer Science, Georgetown University, kobbi.nissim@georgetown.edu; Uri Stemmer, Blavatnik School of Computer Science, Tel Aviv University, and Google Research, u@uri.co.il; Eliad Tsfadia, Department of Computer Science, Georgetown University, eliadtsfadia@gmail.com
Pseudocode Yes Protocol 1.9 (Key-Agreement Protocol (P1, P2) via a balanced adversary A = (A1, A2)).
Input: 1^n. Let ℓ = ℓ(n) and X = X(n) be the number of queries and the domain used by the adversary A.
Operation: P1 emulates A1 on input n to obtain a distribution D over X (specified by a sampling procedure), and samples n i.i.d. samples S. P2 initializes an emulation of A2 on input n. For i = 1 to ℓ:
1. P2 receives the ith query qi from the emulated A2 and sends it to P1.
2. P1 computes yi = (1/n) Σ_{x∈S} qi(x), and sends it to P2.
3. P2 sends yi as the ith answer to the emulated A2.
P1 and P2 agree on E_{x∼D}[q_ℓ(x)].
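The message flow of Protocol 1.9 can be illustrated with a minimal Python sketch. Note this is only a toy illustration of the P1/P2 interaction: the real A1 and A2 are the cryptographic adversary components constructed in the paper, whereas here A1 is replaced by a uniform bit distribution and A2 by fixed (non-adaptive) placeholder queries.

```python
import random

def A1(n):
    """Toy stand-in for A1: returns a sampling procedure for a
    distribution D over X = {0, 1} (uniform, for illustration)."""
    return lambda: random.randint(0, 1)

def A2_queries(l):
    """Toy stand-in for A2: l fixed statistical queries q_i : X -> [0, 1].
    The paper's A2 chooses its queries adaptively from earlier answers."""
    return [(lambda x, i=i: float(x)) for i in range(l)]

def protocol(n, l=3):
    sample_D = A1(n)                    # P1 emulates A1 to obtain D
    S = [sample_D() for _ in range(n)]  # P1 draws n i.i.d. samples from D
    answers = []
    for q in A2_queries(l):             # P2 forwards each query q_i to P1
        # P1 answers with the empirical mean (1/n) * sum_{x in S} q_i(x)
        y_i = sum(q(x) for x in S) / n
        answers.append(y_i)             # P2 feeds y_i back to the emulated A2
    # Both parties end up agreeing on an estimate of E_{x~D}[q_l(x)]
    return answers[-1]

print(protocol(n=1000))
```

Since the toy queries return the sample itself, the agreed value is simply the empirical mean of 1000 uniform bits, i.e. a number near 0.5; in the actual protocol this shared value is what the two parties use as their agreed key.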
Open Source Code No The paper does not contain any statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets No The paper is theoretical and does not use or describe a publicly available dataset for training or evaluation. It discusses theoretical models involving "n i.i.d. samples from an unknown distribution D" but no actual datasets.
Dataset Splits No The paper is theoretical and does not discuss dataset splits for training, validation, or testing.
Hardware Specification No The paper is theoretical and does not mention any hardware specifications used for experiments.
Software Dependencies No The paper is theoretical and does not list any specific software dependencies with version numbers.
Experiment Setup No The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings.