Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DisGUIDE: Disagreement-Guided Data-Free Model Extraction
Authors: Jonathan Rosenthal, Eric Enouen, Hung Viet Pham, Lin Tan
AAAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation on popular datasets CIFAR-10 and CIFAR-100 shows that our approach improves the final model accuracy by up to 3.42% and 18.48% respectively. The average number of queries required to achieve the accuracy of the prior state of the art is reduced by up to 64.95%. |
| Researcher Affiliation | Academia | Jonathan Rosenthal1, Eric Enouen2*, Hung Viet Pham3 , Lin Tan1 1 Purdue University 2 The Ohio State University 3 York University |
| Pseudocode | No | The paper describes the DisGUIDE training process but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | DisGUIDE codebase: https://github.com/lin-tan/disguide |
| Open Datasets | Yes | We evaluate DisGUIDE on the two widely-used image classification datasets CIFAR-10 and CIFAR-100 (Krizhevsky and Hinton 2009). |
| Dataset Splits | No | The paper refers to 'held-out test sets' and 'training data' but does not provide specific train/validation/test split percentages, sample counts for each split, or explicit methodology for creating these splits. It uses well-known datasets like CIFAR-10 and CIFAR-100 which have standard splits, but these are not explicitly detailed within the paper's text. |
| Hardware Specification | Yes | We conduct our experiments on a server with 48 CPU cores with 504 GB of RAM and 2080Ti GPUs. |
| Software Dependencies | Yes | Our code uses Pytorch 1.11 and CUDA 10.2. |
| Experiment Setup | Yes | We use the same generator training hyperparameters as DFME: a batch size of 256, Adam optimizer with an initial learning rate of 1e-4 and weight decay of 5e-4. Similarly, we use DFME's hyperparameters for clone training: batch size of 256, SGD with an initial learning rate of 0.1, and the same weight decay as above. ... We select b = 3 as well as a replay buffer size s = 1M ... We empirically set the class diversity loss weight λ = 0.2 and λ = 0.04 for CIFAR-10 and CIFAR-100 experiments respectively. ... We empirically select 1/8 of the generated samples to be set to grayscale. |
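
The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object, which may help when comparing against the released codebase. The following is a minimal sketch and not the authors' code: the class names, field names, dataset keys, and the `make_setup` and `grayscale_count` helpers are all hypothetical. It also assumes that the 1/8 grayscale fraction is applied per generated batch, which the quoted text does not state, and it leaves `b` uninterpreted because the excerpt does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizerConfig:
    """Optimizer settings as quoted in the Experiment Setup row."""
    name: str
    lr: float
    weight_decay: float
    batch_size: int


@dataclass(frozen=True)
class DisguideSetup:
    """Hypothetical container; field names are illustrative only."""
    generator: OptimizerConfig
    clone: OptimizerConfig
    b: int                      # "b = 3"; meaning not given in the excerpt
    replay_buffer_size: int     # "s = 1M"
    class_diversity_weight: float  # lambda, depends on the dataset
    grayscale_fraction: float   # "1/8 of the generated samples"


# Class diversity loss weights quoted for each dataset.
# The dictionary keys are assumed names, not identifiers from the paper.
CLASS_DIVERSITY_WEIGHT = {"cifar10": 0.2, "cifar100": 0.04}


def make_setup(dataset: str) -> DisguideSetup:
    """Build the quoted setup for 'cifar10' or 'cifar100'."""
    if dataset not in CLASS_DIVERSITY_WEIGHT:
        raise ValueError(f"no quoted hyperparameters for dataset {dataset!r}")
    return DisguideSetup(
        generator=OptimizerConfig("adam", lr=1e-4, weight_decay=5e-4, batch_size=256),
        clone=OptimizerConfig("sgd", lr=0.1, weight_decay=5e-4, batch_size=256),
        b=3,
        replay_buffer_size=1_000_000,
        class_diversity_weight=CLASS_DIVERSITY_WEIGHT[dataset],
        grayscale_fraction=1 / 8,
    )


def grayscale_count(setup: DisguideSetup) -> int:
    """Samples per generator batch set to grayscale, assuming the
    fraction is applied per batch (an assumption, not from the source)."""
    return int(setup.generator.batch_size * setup.grayscale_fraction)
```

Under the per-batch assumption, a generator batch of 256 would contain 32 grayscale samples. Only the class diversity weight differs between the CIFAR-10 and CIFAR-100 configurations in this sketch, matching what the quoted text reports.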