reproducibilityindex.ai

Stealing part of a production language model

Authors: Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Eric Wallace, David Rolnick, Florian Tramèr

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments. In order to visualize the intuition behind this attack, Figure 1 illustrates an attack against the Pythia-1.4b LLM. Here, we plot the magnitude of the singular values of Q as we send an increasing number n of queries to the model. We now analyze the efficacy of this attack across a wider range of models: GPT-2 (Radford et al., 2019) Small and XL, Pythia (Biderman et al., 2023) 1.4B and 6.9B, and LLaMA (Touvron et al., 2023) 7B and 65B. The results are in Table 2: our attack recovers the embedding size nearly perfectly, with an error of 0 or 1 in five out of six cases. Evaluation. We now study the efficacy of our practical stealing attack.
Researcher Affiliation	Collaboration	1Google DeepMind, 2ETH Zurich, 3University of Washington, 4OpenAI, 5McGill University.
Pseudocode	Yes	Algorithm 1 Hidden-Dimension Extraction Attack
Open Source Code	Yes	We release supplementary code that deals with testing these attacks without direct API queries at https://github.com/dpaleka/stealing-part-lm-supplementary.
Open Datasets	Yes	We now analyze the efficacy of this attack across a wider range of models: GPT-2 (Radford et al., 2019) Small and XL, Pythia (Biderman et al., 2023) 1.4B and 6.9B, and LLaMA (Touvron et al., 2023) 7B and 65B.
Dataset Splits	No	The paper does not explicitly describe training/validation/test splits for its own experimental setup, as it focuses on attacking existing models rather than training new ones. Therefore, the concept of a 'validation split' as typically applied to training data is not relevant to their direct experiments.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper mentions using "bitsandbytes (Dettmers et al., 2022)" for quantization, but does not specify its version number or versions for other key software components, which is required for reproducibility.
Experiment Setup	No	The paper describes the attack methodology and parameters like cost and queries, but it does not detail experimental setup in terms of hyperparameters or system-level training settings, as it is attacking pre-existing models rather than training new ones.
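
Illustrative sketch (not the authors' code). The Research Type row above describes plotting the singular values of Q as the number of queries n grows. The idea can be sketched on simulated data as follows, assuming Q stacks one full logit vector per query, so that its rank is bounded by the model's hidden dimension. The function names, the toy random "model", and the numerical floor on the singular values are all assumptions made for this example; no real API is queried.

```python
import numpy as np


def estimate_hidden_dim(logits: np.ndarray) -> int:
    """Estimate the hidden dimension from a (n_queries, vocab_size) logit matrix.

    Assumes n_queries exceeds the true hidden dimension. The estimate is the
    number of singular values that come before the largest multiplicative gap.
    """
    s = np.linalg.svd(logits, compute_uv=False)  # sorted in descending order
    # Assumption for this sketch: floor values at machine precision relative to
    # the largest singular value, so numerical noise cannot create a larger gap.
    floor = s[0] * np.finfo(s.dtype).eps
    s = np.maximum(s, floor)
    gaps = np.log(s[:-1]) - np.log(s[1:])
    return int(np.argmax(gaps)) + 1


def simulate_logits(n_queries: int, hidden_dim: int, vocab_size: int,
                    seed: int = 0) -> np.ndarray:
    """Hypothetical stand-in for a model: logits = W @ h for random hidden states."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((vocab_size, hidden_dim))  # toy output projection
    H = rng.standard_normal((hidden_dim, n_queries))   # toy final hidden states
    return (W @ H).T                                   # shape (n_queries, vocab_size)


TRUE_HIDDEN_DIM = 64
Q = simulate_logits(n_queries=128, hidden_dim=TRUE_HIDDEN_DIM, vocab_size=512)
estimated = estimate_hidden_dim(Q)
print(f"true hidden dim = {TRUE_HIDDEN_DIM}, estimated = {estimated}")
```

In this toy setting the logits are exact and the rank drop is sharp. Against a real API, the logits would have to be reconstructed from whatever the interface exposes, and the gap would likely be less clean.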