Reconstructing Training Data From Trained Neural Networks

Authors: Niv Haim, Gal Vardi, Gilad Yehudai, Ohad Shamir, Michal Irani

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate our method for binary MLP classifiers on a few standard computer vision datasets. In this section we exemplify our dataset reconstruction scheme on a toy example of 2-dimensional data... We conduct experiments on binary classification tasks where images are taken from the MNIST [LeCun et al., 2010] and CIFAR10 [Krizhevsky et al., 2009] datasets..."
Researcher Affiliation | Academia | Niv Haim (Weizmann Institute of Science), Gal Vardi (TTI-Chicago and Hebrew University), Gilad Yehudai (Weizmann Institute of Science), Ohad Shamir (Weizmann Institute of Science), Michal Irani (Weizmann Institute of Science)
Pseudocode | No | The paper describes the reconstruction scheme and optimization process in text and mathematical equations (Eq. 6, 7, 8) but does not provide a formal pseudocode block or algorithm steps.
Open Source Code | Yes | Project page: https://giladude1.github.io/reconstruction and "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]" from the checklist.
Open Datasets | Yes | "We conduct experiments on binary classification tasks where images are taken from the MNIST [LeCun et al., 2010] and CIFAR10 [Krizhevsky et al., 2009] datasets"
Dataset Splits | No | The paper mentions "normalize the train and test sets" and "use the original test sets of MNIST/CIFAR10 with 10000/8000 images respectively". While this implies the standard train/test splits, the paper does not explicitly specify a validation split or its size/methodology.
Hardware Specification | No | The paper states "We train our models using full batch gradient descent for 10^6 epochs with a learning rate of 0.01." but does not provide specific details about the hardware (e.g., CPU or GPU models) used for training or experiments.
Software Dependencies | No | The paper mentions PyTorch in its references, but it does not specify the version number of PyTorch or any other software dependency within the methodology or experimental setup sections.
Experiment Setup | Yes | "Our models comprise of three fully-connected layers with dimensions d-1000-1000-1... The parameters are initialized using standard Kaiming He initialization... weights of the first layer that are initialized to a Gaussian distribution with standard deviation 10^-4... train our models using full batch gradient descent for 10^6 epochs with a learning rate of 0.01." and "We minimize the loss defined in Eq. (8) with α1 = 1, α2 = 5, α3 = 1. We initialize xi ∼ N(0, σx I), where σx is a hyperparameter, and λi ∼ U[0, 1]... We optimize our loss for 100,000 iterations using an SGD optimizer with momentum 0.9." (illustrative PyTorch sketches of this configuration appear below the table)
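For readers trying to reproduce the setup, the training hyperparameters quoted in the Experiment Setup row can be read as the following PyTorch sketch. This is not the authors' released code (see the project page for that); the helper names `build_mlp`/`train`, the choice of the Kaiming-normal variant, the zeroed biases, and the binary cross-entropy loss are assumptions made here for illustration.

```python
import torch
import torch.nn as nn

def build_mlp(d: int) -> nn.Sequential:
    """d-1000-1000-1 fully connected binary classifier, as quoted in the table."""
    model = nn.Sequential(
        nn.Linear(d, 1000), nn.ReLU(),
        nn.Linear(1000, 1000), nn.ReLU(),
        nn.Linear(1000, 1),
    )
    # Kaiming (He) initialization for all linear layers; the quote does not say
    # whether the normal or uniform variant was used, so normal is assumed here.
    for layer in model:
        if isinstance(layer, nn.Linear):
            nn.init.kaiming_normal_(layer.weight)
            nn.init.zeros_(layer.bias)  # bias handling is an assumption
    # First-layer weights overridden with a Gaussian of standard deviation 1e-4.
    nn.init.normal_(model[0].weight, mean=0.0, std=1e-4)
    return model

def train(model: nn.Module, X: torch.Tensor, y: torch.Tensor,
          epochs: int = 1_000_000, lr: float = 0.01) -> nn.Module:
    """Full-batch gradient descent for 10^6 epochs with learning rate 0.01, as quoted."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()  # the excerpt does not name the loss; assumed here
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X).squeeze(-1), y.float())
        loss.backward()
        opt.step()
    return model

# Example usage for a binary dataset (X, y) with labels in {0, 1}:
# model = train(build_mlp(X.shape[1]), X, y)
```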
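The second quote in that row describes the reconstruction optimization itself. The skeleton below mirrors only what is quoted: the initialization xi ∼ N(0, σx I) and λi ∼ U[0, 1], the weights α1 = 1, α2 = 5, α3 = 1, and 100,000 SGD iterations with momentum 0.9. The three terms of the paper's Eq. (8) are not reproduced in the table, so they are taken as a user-supplied callable `eq8_terms` rather than invented here, and the learning rate `lr` is an arbitrary placeholder not stated in the excerpt.

```python
import torch
from typing import Callable, Tuple

def reconstruct(model: torch.nn.Module,
                eq8_terms: Callable[[torch.nn.Module, torch.Tensor, torch.Tensor],
                                    Tuple[torch.Tensor, torch.Tensor, torch.Tensor]],
                num_samples: int, d: int, sigma_x: float,
                iters: int = 100_000, lr: float = 0.1,
                alpha1: float = 1.0, alpha2: float = 5.0, alpha3: float = 1.0):
    """Sketch of the reconstruction loop described in the Experiment Setup row.

    `eq8_terms` must return the three terms of the paper's Eq. (8); their exact
    forms are not quoted in the table above. `lr` is a placeholder hyperparameter.
    """
    # Initialization as quoted: x_i ~ N(0, sigma_x * I), lambda_i ~ U[0, 1].
    x = (sigma_x * torch.randn(num_samples, d)).requires_grad_(True)
    lam = torch.rand(num_samples).requires_grad_(True)

    # 100,000 iterations of SGD with momentum 0.9, as quoted.
    opt = torch.optim.SGD([x, lam], lr=lr, momentum=0.9)
    for _ in range(iters):
        opt.zero_grad()
        t1, t2, t3 = eq8_terms(model, x, lam)
        loss = alpha1 * t1 + alpha2 * t2 + alpha3 * t3  # weights alpha1=1, alpha2=5, alpha3=1
        loss.backward()
        opt.step()
    return x.detach(), lam.detach()
```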