Pruning has a disparate impact on model accuracy
Authors: Cuong Tran, Ferdinando Fioretto, Jung-Eun Kim, Rakshit Naidu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The theoretical findings suggest the presence of two key factors responsible for why accuracy disparities arise in pruned models: (1) disparity in gradient norms across groups, and (2) disparity in the Hessian matrices associated with the loss function computed using a group's data. Informally, the former carries information about the groups' local optimality, while the latter relates to model separability. We analyze these factors in detail, providing both theoretical and empirical support on a variety of settings, networks, and datasets. (An illustrative sketch of both quantities follows the table.) |
| Researcher Affiliation | Academia | Cuong Tran, Department of Computer Science, Syracuse University (cutran@syr.edu); Ferdinando Fioretto, Department of Computer Science, Syracuse University (ffiorett@syr.edu); Jung-Eun Kim, Department of Computer Science, North Carolina State University (jung-eun.kim@ncsu.edu); Rakshit Naidu, Department of Computer Science, Carnegie Mellon University (rnemakal@andrew.cmu.edu) |
| Pseudocode | No | The paper describes methods in prose and mathematical equations but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide any links to a code repository. |
| Open Datasets | Yes | These results use the UTKFace dataset [41] for a vision task whose goal is to classify ethnicity. The experiments use a ResNet-18 architecture, and the pruned counterparts remove the P% of parameters with the smallest absolute values, for various P (see the magnitude-pruning sketch below the table). All reported metrics are normalized and averaged over 10 repetitions. |
| Dataset Splits | No | The paper mentions using datasets like UTKFace, CIFAR-10, and SVHN, and training models, but it does not explicitly state the training, validation, and test dataset splits (e.g., percentages or sample counts for each split) within the main body or referenced appendices. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions neural network architectures (e.g., ResNet-18, ResNet-50, VGG-19) and optimizers (SGD) but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The models are trained using the SGD optimizer with momentum 0.9 and an initial learning rate of 0.01, with a cosine annealing scheduler, for 100 epochs (see the training-setup sketch below the table). |
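
The two disparity factors quoted in the Research Type row can be estimated directly from a trained model. The sketch below is illustrative only and not the authors' code: PyTorch is an assumption (the paper lists no software dependencies), `model`, `loss_fn`, `x_g`, and `y_g` are hypothetical placeholders for a network, a loss, and one demographic group's data, and the Hessian factor is approximated with a standard Hutchinson trace estimator rather than the full matrix.

```python
import torch

def group_gradient_norm(model, loss_fn, x_g, y_g):
    """L2 norm of the loss gradient on one group's data (factor 1)."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x_g), y_g)
    grads = torch.autograd.grad(loss, params)
    return torch.sqrt(sum(g.pow(2).sum() for g in grads))

def group_hessian_trace(model, loss_fn, x_g, y_g, n_samples=10):
    """Hutchinson estimate of tr(H) for the loss on one group's data (factor 2)."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x_g), y_g)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    trace = 0.0
    for _ in range(n_samples):
        # Rademacher probe vectors (+1/-1), one per parameter tensor
        vs = [torch.randint_like(p, 2) * 2.0 - 1.0 for p in params]
        # Hessian-vector product via double backprop
        hv = torch.autograd.grad(grads, params, grad_outputs=vs, retain_graph=True)
        trace = trace + sum((h * v).sum() for h, v in zip(hv, vs))
    return trace / n_samples
```

Comparing these two quantities across demographic groups gives an informal probe of which groups a pruned model is likely to hurt most.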
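
The pruning rule described in the Open Datasets row is plain magnitude pruning: zero out the P% of weights with the smallest absolute values. A minimal sketch, again assuming PyTorch since no code is released, is below; `torch.nn.utils.prune.global_unstructured` with `L1Unstructured` implements the same idea in the library.

```python
import torch

def magnitude_prune(model, p_percent):
    """Zero the p_percent% of parameters with the smallest absolute value."""
    flat = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = int(flat.numel() * p_percent / 100.0)
    if k == 0:
        return model
    threshold = flat.kthvalue(k).values  # k-th smallest magnitude overall
    with torch.no_grad():
        for p in model.parameters():
            # keep weights strictly above the threshold, zero the rest
            p.mul_((p.abs() > threshold).to(p.dtype))
    return model
```

Sweeping `p_percent` over several values of P and re-evaluating per-group accuracy after each prune reproduces the experimental axis the row describes.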
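
The one training detail the table extracts (SGD, momentum 0.9, initial learning rate 0.01, cosine annealing, 100 epochs) translates directly into a few lines of optimizer setup. Everything besides those four hyperparameters is a stand-in: the tiny model, loss, and synthetic batch below exist only to make the sketch self-contained, and PyTorch itself is an assumption.

```python
import torch
from torch import nn

# Stand-ins so the sketch runs; the paper trains ResNet-18 on real datasets.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
loss_fn = nn.CrossEntropyLoss()
train_loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))]

EPOCHS = 100  # reported training length
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    scheduler.step()  # cosine decay of the learning rate once per epoch
```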