Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
Authors: Gon Buzaglo, Niv Haim, Gilad Yehudai, Gal Vardi, Yakir Oz, Yaniv Nikankin, Michal Irani
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct the following experiment: we train an MLP classifier with architecture D-1000-1000-C on samples from the CIFAR10 [Krizhevsky et al., 2009] dataset. |
| Researcher Affiliation | Academia | Gon Buzaglo¹, Niv Haim¹, Gilad Yehudai¹, Gal Vardi², Yakir Oz¹, Yaniv Nikankin¹, Michal Irani¹ (¹Weizmann Institute of Science, Rehovot, Israel; ²TTI-Chicago and the Hebrew University of Jerusalem) |
| Pseudocode | No | The paper describes the reconstruction algorithm steps, but it does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | Yes | Code: https://github.com/gonbuzaglo/decoreco |
| Open Datasets | Yes | CIFAR10 [Krizhevsky et al., 2009], MNIST [LeCun et al., 2010], SVHN [Netzer et al., 2011], and CIFAR100 |
| Dataset Splits | No | The paper mentions the number of training samples used (e.g., "500 samples", "N training samples") and test accuracy, but does not explicitly provide the specific percentages or absolute counts for training, validation, and test splits required for reproduction. |
| Hardware Specification | Yes | Runtime of a single reconstruction run (a specific choice of hyperparameters) from a model D-1000-1000-1 is about 20 minutes on an NVIDIA Tesla V100 32GB or an NVIDIA A40 (Ampere) 48GB GPU. |
| Software Dependencies | No | The paper mentions using "Weights & Biases framework" but does not provide specific version numbers for software components or libraries. |
| Experiment Setup | Yes | The models reconstructed in the main part of the paper were trained with learning rates of 0.01 for binary classifiers (both MLP and convolutional) and 0.5 for the multi-class classifier (Section 4). The models were trained with full-batch gradient descent for 10^6 epochs to guarantee convergence to a KKT point of Eq. (1) or a local minimum of Eq. (13). When small initialization of the first layer is used (e.g., in Figs. 2 and 3), the weights are initialized with a scale of 10^-4. The hyperparameter search ranges are: learning rate: log-uniform in [10^-5, 1]; σx: log-uniform in [10^-6, 0.1]; λmin: uniform in [0.01, 0.5]; α: uniform in [10, 500]. (Training and search-range sketches follow the table.) |
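For concreteness, here is a minimal sketch of the training setup described in the Experiment Setup row, assuming PyTorch. The D-1000-1000-C architecture denotes an input dimension D, two hidden layers of 1000 units, and C output classes; the data placeholders, the loss choice, and the way the 10^-4 first-layer scale is applied are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# CIFAR10: flattened 3x32x32 inputs, 10 classes (D-1000-1000-C with C = 10).
D, C = 3 * 32 * 32, 10

model = nn.Sequential(
    nn.Linear(D, 1000),
    nn.ReLU(),
    nn.Linear(1000, 1000),
    nn.ReLU(),
    nn.Linear(1000, C),
)

# Small first-layer initialization, as used in Figs. 2 and 3. Scaling the
# default init by 1e-4 is one plausible reading of "initialized with a
# scale of 10^-4".
with torch.no_grad():
    model[0].weight.mul_(1e-4)

# Placeholders for the training set (the paper mentions e.g. 500 samples).
X = torch.randn(500, D)
y = torch.randint(0, C, (500,))

# Full-batch gradient descent: lr = 0.5 for the multi-class classifier
# (0.01 for binary classifiers), run for 10^6 epochs in the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(10**6):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```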
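Likewise, a sketch of drawing one candidate configuration from the quoted hyperparameter search ranges. The paper runs the search with the Weights & Biases framework; the plain NumPy sampler below is a stand-in, and `log_uniform` is a hypothetical helper, not part of any library.

```python
import numpy as np

rng = np.random.default_rng()

def log_uniform(low, high):
    """Sample log-uniformly from [low, high] (hypothetical helper)."""
    return float(np.exp(rng.uniform(np.log(low), np.log(high))))

# One candidate configuration, matching the ranges quoted above.
config = {
    "learning_rate": log_uniform(1e-5, 1.0),      # log-uniform in [10^-5, 1]
    "sigma_x": log_uniform(1e-6, 0.1),            # log-uniform in [10^-6, 0.1]
    "lambda_min": float(rng.uniform(0.01, 0.5)),  # uniform in [0.01, 0.5]
    "alpha": float(rng.uniform(10, 500)),         # uniform in [10, 500]
}
print(config)
```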