Dataset Inference: Ownership Resolution in Machine Learning
Authors: Pratyush Maini, Mohammad Yaghini, Nicolas Papernot
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on CIFAR10, SVHN, CIFAR100 and ImageNet show that model owners can claim with confidence greater than 99% that their model (or dataset as a matter of fact) was stolen, despite only exposing 50 of the stolen model's training points. |
| Researcher Affiliation | Academia | Pratyush Maini, IIT Delhi (pratyush.maini@gmail.com); Mohammad Yaghini and Nicolas Papernot, University of Toronto and Vector Institute ({mohammad.yaghini,nicolas.papernot}@utoronto.ca). Work done while an intern at the University of Toronto and Vector Institute. |
| Pseudocode | No | The paper describes its methods in detail but does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code and models for reproducing our work can be found at github.com/cleverhans-lab/dataset-inference |
| Open Datasets | Yes | Datasets. We perform our experiments on the CIFAR10, CIFAR100, SVHN and ImageNet datasets. These remain popular image classification benchmarks, further description about which can be found in Appendix E.1. |
| Dataset Splits | Yes | The victim trains the confidence regressor with the help of embeddings generated by querying the WideResNet-50-2 architecture over the training and validation sets separately. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for its experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions software components like "SGD optimizer" and model architectures (e.g., "WideResNet"), but it does not specify version numbers for any programming languages, libraries, or frameworks used (e.g., Python version, PyTorch/TensorFlow version). |
| Experiment Setup | Yes | In all the training methods, we use a fixed learning rate strategy with SGD optimizer and decay the learning rate by a factor of 0.2 at the end of 0.3, 0.6, and 0.8 of the total number of epochs. A sketch of this schedule appears below the table. |
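
The learning-rate schedule quoted in the Experiment Setup row can be expressed as a short PyTorch sketch. This is a minimal illustration, not the authors' code: the total epoch count, model, base learning rate, and momentum below are placeholders rather than values reported in the paper.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR

epochs = 100                      # placeholder total epoch count
model = nn.Linear(10, 2)          # placeholder model
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # illustrative SGD settings

# Decay the learning rate by a factor of 0.2 at 0.3, 0.6, and 0.8 of the
# total number of epochs, as described in the Experiment Setup row.
milestones = [int(epochs * f) for f in (0.3, 0.6, 0.8)]
scheduler = MultiStepLR(optimizer, milestones=milestones, gamma=0.2)

for epoch in range(epochs):
    # ... one epoch of training would run here ...
    scheduler.step()
```

With `epochs = 100`, the learning rate would drop at epochs 30, 60, and 80, each time multiplied by 0.2.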