Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scaling Up Exact Neural Network Compression by ReLU Stability

Authors: Thiago Serra, Xin Yu, Abhinav Kumar, Srikumar Ramalingam

NeurIPS 2021 | Venue PDF | LLM Run Details

Each reproducibility variable is listed below with its classified result and the LLM response used as evidence.
Research Type: Experimental
We trained and evaluated the compressibility of classifiers for the datasets MNIST [53], CIFAR-10 [48], and CIFAR-100 [48], with and without ℓ1 weight regularization, which is known to induce stability [90]. We refer to Appendix A6 for details on environment and implementation. We use the notation L n for the architecture of L hidden layers with n neurons each. We started at L = 2 and n = 100, and then doubled the width n or incremented the depth L until the majority of the runs for MNIST classifiers for any configuration timed out after 3 hours. With preliminary runs, we chose values of ℓ1 spanning from those for which accuracy improves as ℓ1 increases to those at which accuracy starts to decrease. We trained and evaluated neural networks with 5 different random initialization seeds for each choice of ℓ1. The amount of regularization used did not stabilize the entire layer. We refer to Appendix A7 for additional figures and tables with complete results.
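The "L n" architectures and the ℓ1 penalty described above can be sketched as follows. This is an illustrative reconstruction in NumPy, not the authors' code; the function names, initialization scheme, and layer sizes below are our own assumptions.

```python
import numpy as np

def build_mlp(d_in, L, n, d_out, seed=0):
    # Random weights for an MLP with L hidden layers of n ReLU units each,
    # following the paper's "L n" notation (illustrative initialization).
    rng = np.random.default_rng(seed)
    sizes = [d_in] + [n] * L + [d_out]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    # Forward pass: ReLU on hidden layers, linear output layer.
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def l1_penalty(params, lam):
    # The l1 weight-regularization term the paper sweeps over.
    return lam * sum(np.abs(W).sum() for W, _ in params)
```

For example, the paper's starting configuration (L = 2, n = 100) on MNIST would correspond to `build_mlp(784, 2, 100, 10)`.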
Researcher Affiliation: Collaboration
Thiago Serra, Bucknell University, Lewisburg, PA, United States; Xin Yu, University of Utah, Salt Lake City, UT, United States; Abhinav Kumar, Michigan State University, East Lansing, MI, United States; Srikumar Ramalingam, Google Research, New York, NY, United States
Pseudocode: Yes
Algorithm 1, which we denote ISA (Identifying Stable Activations), identifies all stable neurons of a neural network.
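To give a loose sense of what stable-activation detection looks like, here is an empirical proxy in NumPy: it flags hidden units whose ReLU never fires, or always fires, across a sample of inputs. Note this is not the paper's Algorithm 1, which certifies stability exactly over all valid inputs using a MILP solver; the sketch below only checks a finite sample.

```python
import numpy as np

def stable_units(preacts, eps=0.0):
    # Classify hidden units as stably active / stably inactive given
    # pre-activation values over a dataset (rows = samples, cols = units).
    # Empirical proxy only; the paper's ISA proves stability via MILP.
    always_pos = (preacts > eps).all(axis=0)   # ReLU always passes its input
    always_neg = (preacts <= eps).all(axis=0)  # ReLU always outputs zero
    return always_pos, always_neg
```

Stably inactive units can be removed outright, and stably active units behave linearly, so they can be folded into the next layer's weights; this is the basis of the exact compression the paper studies.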
Open Source Code: Yes
The code is available at https://github.com/yuxwind/ExactCompression.
Open Datasets: Yes
We trained and evaluated the compressibility of classifiers for the datasets MNIST [53], CIFAR-10 [48], and CIFAR-100 [48] with and without ℓ1 weight regularization, which is known to induce stability [90].
Dataset Splits: No
The paper does not explicitly state training/validation/test splits with percentages or sample counts. While it refers to preprocessing on the training set and evaluation on the test set, it does not define a separate validation set or its split.
Hardware Specification: Yes
All experiments are run on a Linux server with 40 CPUs, 180 GB memory, and an Nvidia GeForce RTX 2080 Ti GPU.
Software Dependencies: Yes
We use Gurobi 9.1 as the MILP solver and PyTorch 1.7.0 as the deep learning framework.
Experiment Setup: Yes
For all networks, we used the Adam optimizer, trained for 50 epochs with an initial learning rate of 1e-3 and a batch size of 128.
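As a minimal illustration of the stated optimizer setup, the sketch below reimplements a single textbook Adam update with the paper's initial learning rate of 1e-3. This is not the authors' training code (which uses PyTorch's built-in Adam); the beta and epsilon values are the common defaults, assumed here.

```python
import numpy as np

def adam_step(w, g, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update. `state` holds the first/second moment estimates
    # and the step counter: (m, v, t). lr=1e-3 matches the paper's setup;
    # b1, b2, eps are the usual defaults (assumed, not stated in the paper).
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * g            # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g        # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)            # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, (m, v, t)
```

On the first step the bias-corrected update is approximately lr * sign-scaled gradient, so a parameter with any nonzero gradient moves by roughly the learning rate.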