A Winning Hand: Compressing Deep Networks Can Improve Out-of-Distribution Robustness

Authors: James Diffenderfer, Brian Bartoldson, Shreya Chaganti, Jize Zhang, Bhavya Kailkhura

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To test the CARD hypothesis, we use: five models (VGG [12, 46] and ResNet [19] style architectures of varying size), five sparsity levels (50%, 60%, 80%, 90%, 95%), and six model compression methods (FT, GMP, LTH, LRR, EP, BP). For each model, sparsity level, and compression method, five realizations are trained on the CIFAR-10 training set [28]. Model accuracy and robustness are measured using top-1 accuracy on the CIFAR-10 and CIFAR-10-C test sets, respectively. (A sketch of this experimental grid follows the table.)
Researcher Affiliation | Academia | James Diffenderfer, Brian R. Bartoldson, Shreya Chaganti, Jize Zhang, Bhavya Kailkhura, Lawrence Livermore National Laboratory, {diffenderfer2, bartoldson, chaganti1, zhang64, kailkhura1}@llnl.gov
Pseudocode | No | The paper describes methods and processes but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The abstract states "pretrained CARDs available at [8]", which refers to the RobustBench benchmark, not an open-source release of the code for the methodology described in this paper.
Open Datasets | Yes | For each model, sparsity level, and compression method, five realizations are trained on the CIFAR-10 training set [28]. (A minimal loading snippet follows the table.)
Dataset Splits | No | The paper mentions training on CIFAR-10 and testing on CIFAR-10 and CIFAR-10-C, but does not explicitly specify a separate validation split or its size for its own experiments.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud computing instances used for experiments.
Software Dependencies | No | The paper mentions various methods and frameworks like AugMix, RobustBench, ResNet, VGG, and OpenLTH, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup | Yes | We use the hyperparameters specifically tuned for each approach; see Appendix B for additional details. For each model, sparsity level, and compression method, five realizations are trained on the CIFAR-10 training set [28]. We experiment with four models of increasing size... three data augmentation methods (clean, AugMix, Gaussian), two sparsity levels (90%, 95%), and six compression methods... We also formed domain-agnostic and domain-adaptive n-CARD-Decks of size n ∈ {2, 4, 6} comprised of models using the same compression method and sparsity level. For each n-CARD-Deck, half of the CARDs were trained using AugMix and the other half were trained using the Gaussian augmentation. In our experiments, we took P = 5000 and these KD-Trees were generated once and saved (separate from the inference process). At test time, batches of M = 100 test images were used in the spectral-similarity metric. (A sketch of this KD-tree routing step follows the table.)
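
For orientation, here is a minimal sketch of the experimental grid quoted under Research Type. The `train_compressed` and `top1_accuracy` helpers and the model names are hypothetical placeholders standing in for the authors' actual training and evaluation pipeline:

```python
import itertools
import random

# Hypothetical stand-ins for the real training/evaluation pipeline.
def train_compressed(model, sparsity, method, seed):
    """Placeholder: would train `model` on CIFAR-10, compressed to
    `sparsity` with compression `method`, using random seed `seed`."""
    return (model, sparsity, method, seed)

def top1_accuracy(net, test_set):
    """Placeholder: would report top-1 accuracy on `test_set`."""
    return random.random()

MODELS = ["VGG-small", "VGG-large", "ResNet-small",
          "ResNet-medium", "ResNet-large"]            # illustrative names
SPARSITIES = [0.50, 0.60, 0.80, 0.90, 0.95]           # five sparsity levels
METHODS = ["FT", "GMP", "LTH", "LRR", "EP", "BP"]     # six compression methods
SEEDS = range(5)                                      # five realizations each

results = {}
for combo in itertools.product(MODELS, SPARSITIES, METHODS, SEEDS):
    net = train_compressed(*combo)
    results[combo] = {
        "accuracy": top1_accuracy(net, "CIFAR-10 test"),  # clean accuracy
        "robustness": top1_accuracy(net, "CIFAR-10-C"),   # OOD robustness
    }
```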
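Since CIFAR-10 is openly available, loading it is straightforward with torchvision. This snippet is a generic example rather than the authors' data pipeline, and the normalization constants are the commonly used CIFAR-10 channel statistics, not values from the paper:

```python
import torchvision
import torchvision.transforms as T

# Commonly used per-channel CIFAR-10 statistics (not taken from the paper).
normalize = T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
transform = T.Compose([T.ToTensor(), normalize])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transform)

# CIFAR-10-C is distributed separately as NumPy arrays (one file per
# corruption type) and is not available through torchvision.
```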
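To make the KD-tree step under Experiment Setup concrete, here is a hedged sketch of domain-adaptive routing: one KD-tree per augmentation domain is built over spectral features of P = 5000 images, and a test batch of M = 100 images is routed to the spectrally nearest domain. The `spectral_feature` definition (low-frequency log-magnitude FFT block) is an assumption, not necessarily the paper's exact spectral-similarity metric, and the arrays below are random placeholders for real training images:

```python
import numpy as np
from scipy.spatial import cKDTree

def spectral_feature(images, k=8):
    """Per-image feature: log-magnitude of the low-frequency k x k FFT
    block of the grayscale image (assumed metric, not the paper's exact one)."""
    gray = images.mean(axis=-1)           # (N, H, W) grayscale
    mags = np.abs(np.fft.fft2(gray))      # 2-D spectrum per image
    return np.log1p(mags[:, :k, :k]).reshape(len(images), -1)

P, M = 5000, 100  # tree size and test-batch size reported in the paper
rng = np.random.default_rng(0)

# Random placeholders standing in for AugMix- and Gaussian-augmented images.
domains = {
    "augmix": rng.random((P, 32, 32, 3)),
    "gaussian": np.clip(rng.random((P, 32, 32, 3))
                        + rng.normal(0.0, 0.1, (P, 32, 32, 3)), 0.0, 1.0),
}
# One KD-tree per augmentation domain, built once and reused at test time.
trees = {name: cKDTree(spectral_feature(imgs)) for name, imgs in domains.items()}

test_batch = rng.random((M, 32, 32, 3))
feats = spectral_feature(test_batch)
# Route the batch to the domain whose training spectra are closest on average.
mean_dist = {name: tree.query(feats)[0].mean() for name, tree in trees.items()}
chosen = min(mean_dist, key=mean_dist.get)
print(f"Run this batch through CARDs trained with the {chosen} augmentation")
```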