Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses

Authors: Gon Buzaglo, Niv Haim, Gilad Yehudai, Gal Vardi, Yakir Oz, Yaniv Nikankin, Michal Irani

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct the following experiment: we train an MLP classifier with architecture D-1000-1000-C on samples from the CIFAR10 [Krizhevsky et al., 2009] dataset. (A model sketch follows the table.)
Researcher Affiliation | Academia | Gon Buzaglo¹, Niv Haim¹, Gilad Yehudai¹, Gal Vardi², Yakir Oz¹, Yaniv Nikankin¹, Michal Irani¹. ¹Weizmann Institute of Science, Rehovot, Israel; ²TTI-Chicago and the Hebrew University of Jerusalem.
Pseudocode | No | The paper describes the steps of the reconstruction algorithm, but does not present them in a structured pseudocode or algorithm block.
Open Source Code | Yes | Code: https://github.com/gonbuzaglo/decoreco
Open Datasets | Yes | CIFAR10 [Krizhevsky et al., 2009], MNIST [LeCun et al., 2010], SVHN [Netzer et al., 2011], and CIFAR100. (A loading sketch follows the table.)
Dataset Splits | No | The paper states the number of training samples used (e.g., "500 samples", "N training samples") and reports test accuracy, but does not explicitly provide the percentages or absolute counts of the training, validation, and test splits needed for reproduction.
Hardware Specification | Yes | Runtime of a single reconstruction run (a specific choice of hyperparameters) from a model D-1000-1000-1 takes about 20 minutes on a Tesla V100 32GB or an NVIDIA A40 48GB (Ampere) GPU.
Software Dependencies | No | The paper mentions using the "Weights & Biases framework" but does not provide specific version numbers for software components or libraries.
Experiment Setup | Yes | The models reconstructed in the main part of the paper were trained with learning rates of 0.01 for binary classifiers (both MLP and convolutional) and 0.5 for the multi-class classifier (Section 4). The models were trained with full-batch gradient descent for 10^6 epochs, to guarantee convergence to a KKT point of Eq. (1) or a local minimum of Eq. (13). When small initialization of the first layer is used (e.g., in Figs. 2 and 3), the weights are initialized with a scale of 10^-4. The exact ranges for the hyperparameter search are: learning rate: log-uniform in [10^-5, 1]; σ_x: log-uniform in [10^-6, 0.1]; λ_min: uniform in [0.01, 0.5]; α: uniform in [10, 500]. (A sampling sketch follows the table.)
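
As referenced in the Research Type and Experiment Setup rows, the trained models are MLPs of architecture D-1000-1000-C optimized with full-batch gradient descent, optionally with the first layer initialized at a small scale (10^-4). The PyTorch sketch below is a minimal illustration of that setup under these assumptions; it is not the authors' implementation (see their repository for that), and the names MLP and train_full_batch, as well as the choice of cross-entropy loss and plain SGD, are illustrative.

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    """D-1000-1000-C MLP: two hidden layers of width 1000, C output logits."""

    def __init__(self, input_dim: int, num_classes: int, first_layer_scale: float = 1e-4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 1000),
            nn.ReLU(),
            nn.Linear(1000, 1000),
            nn.ReLU(),
            nn.Linear(1000, num_classes),
        )
        # Small initialization of the first layer (scale 1e-4), as reported for Figs. 2 and 3.
        # Scaling the default init is one interpretation of "initialized with a scale of 10^-4".
        with torch.no_grad():
            self.net[0].weight.mul_(first_layer_scale)
            self.net[0].bias.zero_()

    def forward(self, x):
        return self.net(x)


def train_full_batch(model, x, y, lr=0.5, epochs=10**6):
    """Full-batch gradient descent, per the Experiment Setup row.

    lr=0.5 corresponds to the multi-class case; 0.01 was reported for binary classifiers.
    The loss choice (cross-entropy) is an assumption for the multi-class setting.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return model


# Example: CIFAR10 images flattened to D = 3 * 32 * 32 = 3072, C = 10 classes.
# model = MLP(input_dim=3 * 32 * 32, num_classes=10)
```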
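
The Open Datasets row lists CIFAR10, MNIST, SVHN, and CIFAR100, all of which are available through torchvision. The snippet below is a sketch of how they could be obtained; the paper does not prescribe a loading pipeline, and the flattening transform is an assumption made to match the MLP input dimension D.

```python
import torch
from torchvision import datasets, transforms

# Flatten each image to a vector so it matches the MLP input dimension D
# (an assumption consistent with the D-1000-1000-C architecture).
to_vector = transforms.Compose([transforms.ToTensor(), transforms.Lambda(lambda t: t.flatten())])

cifar10 = datasets.CIFAR10(root="data", train=True, download=True, transform=to_vector)
mnist = datasets.MNIST(root="data", train=True, download=True, transform=to_vector)
svhn = datasets.SVHN(root="data", split="train", download=True, transform=to_vector)
cifar100 = datasets.CIFAR100(root="data", train=True, download=True, transform=to_vector)

# e.g., take the first 500 CIFAR10 samples (the paper reports training on "500 samples").
x = torch.stack([cifar10[i][0] for i in range(500)])
y = torch.tensor([cifar10[i][1] for i in range(500)])
print(x.shape, y.shape)  # torch.Size([500, 3072]) torch.Size([500])
```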
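
The Experiment Setup row also gives the search distributions for the reconstruction hyperparameters. The paper mentions using the Weights & Biases framework for this search; to keep the example self-contained, the sketch below simply samples the reported ranges with NumPy, and the dictionary keys sigma_x, lambda_min, and alpha are illustrative names for σ_x, λ_min, and α.

```python
import numpy as np

rng = np.random.default_rng(0)


def log_uniform(low, high, rng):
    """Sample a value whose logarithm is uniform between log(low) and log(high)."""
    return float(np.exp(rng.uniform(np.log(low), np.log(high))))


def sample_hyperparameters(rng):
    # Ranges as reported in the Experiment Setup row; parameter names are illustrative.
    return {
        "learning_rate": log_uniform(1e-5, 1.0, rng),  # log-uniform in [10^-5, 1]
        "sigma_x": log_uniform(1e-6, 0.1, rng),        # log-uniform in [10^-6, 0.1]
        "lambda_min": float(rng.uniform(0.01, 0.5)),   # uniform in [0.01, 0.5]
        "alpha": float(rng.uniform(10.0, 500.0)),      # uniform in [10, 500]
    }


print(sample_hyperparameters(rng))
```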