Radioactive data: tracing through training

Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Herve Jegou

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on large-scale benchmarks (Imagenet), with standard architectures (Resnet-18, VGG-16, Densenet-121) and training procedures, show that we detect radioactive data with high confidence (p < 0.0001) when only 1% of the data used to train a model is radioactive.
Researcher Affiliation | Collaboration | 1 Facebook AI Research, Paris; 2 Inria, Grenoble.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository.
Open Datasets | Yes | We employ the widely-used benchmarks Imagenet (Deng et al., 2009), a dataset of natural images with 1.2M images belonging to 1,000 classes and Places205 (Zhou et al., 2014), a dataset of 2.4M images from 205 scene categories.
Dataset Splits | No | While a 'validation set' is mentioned (Section 3.4: 'In practice, we use vanilla images of a held-out set (the validation set) to estimate M.'), the paper does not provide specific split percentages or sample counts for the training/validation/test sets to reproduce the data partitioning.
Hardware Specification | No | The paper mentions 'across 8 GPUs' but does not specify the model or type of GPUs or any other specific hardware components used for experiments.
Software Dependencies | No | The paper mentions 'We use Pytorch (Paszke et al., 2017)' but does not specify the version number of Pytorch or any other software dependencies with their versions.
Experiment Setup | Yes | We train with SGD with a momentum of 0.9 and a weight decay of 10^-4 for 90 epochs, using a batch size of 2048 across 8 GPUs. We use the waterfall schedule for the learning rate: it starts at 0.8 and is divided by 10 every 30 epochs. Radioactive data are generated by running SGD by optimizing Equation (5) with R = 10, λ1 = 0.0005 and λ2 = 0.01.
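
The training hyperparameters quoted in the Experiment Setup row can be collected into a small configuration, with the "waterfall" learning-rate schedule written as a function of the epoch. This is only a sketch and not the authors' code: the names TRAIN_CONFIG and waterfall_lr are hypothetical, and it assumes epochs are counted from 0, so that the rate drops at epochs 30 and 60. The paper does not say how epochs are indexed.

```python
# Hyperparameters as quoted in the Experiment Setup row.
# The dictionary layout and key names are illustrative assumptions.
TRAIN_CONFIG = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "epochs": 90,
    "batch_size": 2048,   # total batch size, split across 8 GPUs
    "num_gpus": 8,
    "base_lr": 0.8,
    "lr_step_epochs": 30,  # the rate is divided every 30 epochs
    "lr_divisor": 10.0,
}


def waterfall_lr(epoch, config=TRAIN_CONFIG):
    """Return the learning rate for a 0-indexed epoch (indexing is an assumption)."""
    num_drops = epoch // config["lr_step_epochs"]
    return config["base_lr"] / (config["lr_divisor"] ** num_drops)


if __name__ == "__main__":
    # Print the rate at the first and last epoch of each 30-epoch stage.
    for epoch in (0, 29, 30, 59, 60, 89):
        print(f"epoch {epoch:2d}: lr = {waterfall_lr(epoch):g}")
```

In PyTorch, the same schedule would typically be expressed with torch.optim.SGD together with torch.optim.lr_scheduler.StepLR (step_size=30, gamma=0.1). Whether the authors used that scheduler is not stated in the paper.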