Beyond Slow Signs in High-fidelity Model Extraction

Authors: Hanna Foerster, Robert Mullins, Ilia Shumailov, Jamie Hayes

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our study evaluates the feasibility of parameter extraction methods of Carlini et al. [1] further enhanced by Canales-Martínez et al. [2] for models trained on standard benchmarks. We introduce a unified codebase that integrates previous methods and reveal that computational tools can significantly influence performance. We develop further optimisations to the end-to-end attack and improve the efficiency of extracting weight signs by up to 14.8 times compared to former methods through the identification of easier- and harder-to-extract neurons.
Researcher Affiliation | Collaboration | Hanna Foerster and Robert Mullins (University of Cambridge); Ilia Shumailov and Jamie Hayes (Google DeepMind)
Pseudocode | No | The paper describes methods in text but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our codebase can be found at https://github.com/hannafoe/cryptanalytical-extraction.
Open Datasets | Yes | In our work, we use models trained on MNIST and CIFAR10, as well as randomly generated data, to benchmark full parameter extraction.
Dataset Splits | No | The paper mentions 'test accuracy' but does not provide specific details on train/validation/test dataset splits, percentages, or sample counts.
Hardware Specification | Yes | The performance was tested on an AMD Ryzen 7 4700U processor with 16GB RAM. All extractions in this table were run on a High Performance Cluster with Intel's 10th generation Intel Core processors (Ice Lake).
Software Dependencies | No | The paper mentions machine learning libraries 'jax' and 'TensorFlow' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | Following the discovery that the confidence level and the number of correctly recovered neurons stabilises after about s = 15 iterations, we propose that sign extraction can be made more efficient by running it with fewer iterations. If $\lVert s_L - s_R \rVert < 10^{-13}$, then the process is aborted and restarts at a new critical point... The threshold of $10^{-13}$ is used due to the precision limitations of float64 being at about $10^{-15}$ and correct wiggles typically impacting the output in the order of $10^{-9}$... Carlini et al. [1] set this to 0.1 to start with, as this was a value with which approximately half of the attempts at finding a new witness $x_i$ failed. If too many solutions are found, θ is reduced to half and if no solutions are found θ is multiplied by 1.1.
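
The two numerical rules quoted in the Experiment Setup entry can be illustrated with a minimal Python sketch. This is not taken from the authors' codebase. It assumes that the abort test compares the norm of the difference between two measurements, here called `s_left` and `s_right`, against a threshold of 1e-13 with a strict less-than. The function names and the `max_solutions` cutoff are hypothetical, since the entry does not say how many solutions count as "too many".

```python
import numpy as np

# Threshold quoted in the Experiment Setup entry (assumed to mean 1e-13).
ABORT_THRESHOLD = 1e-13


def wiggle_is_distinguishable(s_left, s_right, threshold=ABORT_THRESHOLD):
    """Return True if the two measurements differ by at least `threshold`.

    Assumption: a difference below the threshold is treated as float64
    noise, in which case the caller would abort and pick a new critical point.
    """
    left = np.atleast_1d(np.asarray(s_left, dtype=np.float64))
    right = np.atleast_1d(np.asarray(s_right, dtype=np.float64))
    return bool(np.linalg.norm(left - right) >= threshold)


def update_theta(theta, num_solutions, max_solutions):
    """Adapt theta as described: halve it if too many solutions are found,
    multiply it by 1.1 if none are found, otherwise leave it unchanged.

    `max_solutions` is a hypothetical cutoff for "too many".
    """
    if num_solutions > max_solutions:
        return theta / 2
    if num_solutions == 0:
        return theta * 1.1
    return theta


if __name__ == "__main__":
    # A difference of about 1e-9 is well above the threshold.
    print(wiggle_is_distinguishable(1.0, 1.0 + 1e-9))
    # A difference of about 1e-15 is below the threshold.
    print(wiggle_is_distinguishable([0.0, 0.0], [0.0, 1e-15]))
    # Starting value of 0.1 as attributed to Carlini et al. [1].
    print(update_theta(0.1, num_solutions=0, max_solutions=5))
```

The gap between the quoted magnitudes explains the choice of threshold: 1e-13 sits roughly two orders of magnitude above float64 precision and four below the typical effect of a correct wiggle, so it separates noise from signal under the stated figures.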