Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee

Authors: Alireza Aghasi, Afshin Abdi, Nam Nguyen, Justin Romberg

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section we present some experiments to highlight its performance against the state of the art techniques. We next apply Net-Trim to the problem of classifying hand-written digits of the mixed national institute of standards and technology (MNIST) dataset."
Researcher Affiliation | Collaboration | Alireza Aghasi (Institute for Insight, Georgia State University / IBM TJ Watson; aaghasi@gsu.edu), Afshin Abdi (Department of ECE, Georgia Tech; abdi@gatech.edu), Nam Nguyen (IBM TJ Watson; nnguyen@us.ibm.com), Justin Romberg (Department of ECE, Georgia Tech; jrom@ece.gatech.edu)
Pseudocode | Yes | Algorithm 1, "Parallel Net-Trim" (a hedged sketch of the layer-wise step appears after this table)
Open Source Code | Yes | "The authors have made the implementation publicly available online." Footnote 3: "The code for the regularized Net-Trim implementation using the ADMM scheme can be accessed online at: https://github.com/DNNToolBox/Net-Trim-v1"
Open Datasets | Yes | "We next apply Net-Trim to the problem of classifying hand-written digits of the mixed national institute of standards and technology (MNIST) dataset. The set contains 60,000 training samples and 10,000 test instances."
Dataset Splits | No | The paper states "The set contains 60,000 training samples and 10,000 test instances" for the MNIST dataset, but does not explicitly mention a validation split.
Hardware Specification | No | The paper mentions distributing jobs "among a cluster of processing units (in our case 64) or using a GPU", but does not provide specifics such as GPU models or CPU types.
Software Dependencies | No | The paper points to the publicly available ADMM-based implementation (footnote 3 above), but does not specify version numbers for any software dependencies or solvers used in the experiments.
Experiment Setup | No | The paper describes network architectures (e.g., a "784×300×300×10 network"; "two convolutional layers composed of 32 filters of size 5×5×1") and the number of training samples, but does not provide hyperparameter values such as learning rates, batch sizes, or number of epochs (a sketch of the fully connected architecture appears below).
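
The core of Algorithm 1 ("Parallel Net-Trim") is a convex, l1-regularized program solved independently for each layer. The sketch below is only a minimal illustration of that layer-wise step: it uses a plain proximal-gradient (ISTA) loop with soft-thresholding rather than the authors' ADMM solver, and the function name net_trim_layer, the regularization weight lam, and the iteration count are illustrative assumptions, not the released implementation.

```python
import numpy as np

def soft_threshold(A, tau):
    """Elementwise soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def net_trim_layer(X, Y, lam=0.1, n_iter=500):
    """Sketch of one layer-wise convex pruning step (assumed form).

    X: (d, N) layer inputs, samples as columns.
    Y: (m, N) layer outputs of the trained net, Y = relu(W.T @ X).
    Returns a sparse U whose ReLU response approximates Y on the data.
    """
    d, N = X.shape
    M = (Y > 0).astype(float)            # mask of active ReLU units
    U = np.zeros((d, Y.shape[0]))
    # Step size from the Lipschitz constant of the smooth part.
    L = np.linalg.norm(X, 2) ** 2 / N
    for _ in range(n_iter):
        Z = U.T @ X                      # (m, N) pre-activations
        # Quadratic fit on active units, squared hinge on inactive ones:
        # both keep the layer-wise problem convex in U.
        R = M * (Z - Y) + (1.0 - M) * np.maximum(Z, 0.0)
        grad = (X @ R.T) / N             # gradient of the smooth loss
        U = soft_threshold(U - grad / L, lam / L)
    return U
```

Given a trained network, each layer's (X, Y) pair comes from a forward pass on the training data, so the layers can be trimmed independently and concurrently, which is the sense in which Algorithm 1 is "parallel".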
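
As a concrete reading of the "784×300×300×10 network" mentioned under Experiment Setup, the following is a minimal, assumed forward pass (ReLU hidden layers, linear output) that would produce the per-layer activation matrices consumed by net_trim_layer above. The paper does not report initialization or training details, and none are invented here.

```python
import numpy as np

def relu(z):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(z, 0.0)

def forward(x, weights):
    """Forward pass through a 784-300-300-10 multilayer perceptron.

    x: (784, N) batch of flattened MNIST digits, samples as columns.
    weights: [W1 of shape (784, 300), W2 (300, 300), W3 (300, 10)].
    Returns pre-softmax class scores plus the per-layer inputs,
    which are exactly the X matrices a layer-wise trim would use.
    """
    activations = [x]
    a = x
    for W in weights[:-1]:
        a = relu(W.T @ a)                # hidden layers use ReLU
        activations.append(a)
    scores = weights[-1].T @ a           # linear output layer, 10 classes
    return scores, activations
```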