DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures

Authors: Huanrui Yang, Wei Wen, Hai Li

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that enforcing DeepHoyer regularizers can produce even sparser neural network models than previous works, under the same accuracy level. We also show that DeepHoyer can be applied to both element-wise and structural pruning.
Researcher Affiliation | Academia | Huanrui Yang, Wei Wen, Hai Li, Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708; {huanrui.yang, wei.wen, hai.li}@duke.edu
Pseudocode | No | The paper describes the proposed regularizers and their gradients using mathematical equations and textual descriptions, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The codes are available at https://github.com/yanghr/DeepHoyer.
Open Datasets | Yes | The MNIST dataset (LeCun et al., 1998) is a well-known handwritten digit dataset consisting of greyscale images with the size of 28×28 pixels. We use the dataset API provided in the torchvision python package to access the dataset. [...] The ImageNet dataset... (Russakovsky et al., 2015), which can be found at http://www.image-net.org/challenges/LSVRC/2012/nonpub-downloads. [...] We also use the CIFAR-10 dataset (Krizhevsky & Hinton, 2009) to evaluate the structural pruning performance on ResNet-56 and ResNet-110 models. The CIFAR-10 dataset can be directly accessed through the dataset API provided in the torchvision python package.
Dataset Splits | Yes | We use all the data in the provided training set to train our model, and use the provided validation set to evaluate our model and report the testing accuracy.
Hardware Specification | Yes | All the MNIST experiments are done with a single TITAN XP GPU. [...] Two TITAN XP GPUs are used in parallel for the AlexNet training and four are used for the ResNet-50 training.
Software Dependencies | No | The paper mentions software like "PyTorch deep learning framework" and "torchvision python package" and optimizers like "Adam optimizer" and "SGD optimizer", but it does not provide specific version numbers for any of these software components.
Experiment Setup | Yes | Adam optimizer (Kingma & Ba, 2014) with learning rate 0.001 is used throughout the training process. All the MNIST experiments are done with a single TITAN XP GPU. [...] Detailed parameter choices used in achieving the reported results are listed in Table 6. [...] For the ResNet-50 experiments on ImageNet, [...] All the models are optimized with the SGD optimizer (Sutskever et al., 2013), and the batch size is chosen as 256 for all the experiments.
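For context on the regularizers the Research Type row refers to: the paper's Hoyer-Square (HS) regularizer is the squared l1 norm of a weight tensor divided by its squared l2 norm, and the Group-HS variant used for structural pruning replaces the element-wise absolute values with per-group l2 norms. Below is a minimal PyTorch sketch of that idea, assuming per-output-channel grouping for convolutional weights and a small `eps` for numerical stability; both of those choices are our assumptions, not settings taken verbatim from the paper.

```python
import torch

def hoyer_square(w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Hoyer-Square (HS): squared l1 norm over squared l2 norm, for element-wise pruning."""
    return w.abs().sum() ** 2 / (w.pow(2).sum() + eps)

def group_hs(w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Group-HS for structural pruning: per-group l2 norms replace the absolute values.
    Here a conv weight of shape (out_channels, in_channels, kH, kW) is grouped by output channel."""
    group_l2 = w.flatten(start_dim=1).norm(p=2, dim=1)  # one l2 norm per filter
    return group_l2.sum() ** 2 / (w.pow(2).sum() + eps)
```

Both quantities are scale-invariant: multiplying `w` by a constant leaves the ratio unchanged, which is the property the paper's title emphasizes.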
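The Open Datasets row notes that MNIST and CIFAR-10 are accessed through the torchvision dataset API; a minimal sketch of that access pattern follows. The `./data` root, the bare `ToTensor` transform, and the batch size are illustrative placeholders rather than settings reported by the paper.

```python
import torch
from torchvision import datasets, transforms

transform = transforms.ToTensor()

# MNIST: 28x28 greyscale digits; standard torchvision train/test split.
mnist_train = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
mnist_test = datasets.MNIST(root="./data", train=False, download=True, transform=transform)

# CIFAR-10: used in the paper for structural pruning of ResNet-56 / ResNet-110.
cifar_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
cifar_test = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Batch size here is a placeholder; the paper reports 256 only for the ImageNet runs.
train_loader = torch.utils.data.DataLoader(mnist_train, batch_size=128, shuffle=True)
```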
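The Experiment Setup row quotes Adam with learning rate 0.001 for the MNIST runs and SGD with batch size 256 for the ImageNet runs; the sketch below shows one way the Hoyer-Square penalty could be folded into a training step, reusing `hoyer_square` from the first snippet. The regularization strength `alpha` and the rule for selecting which parameters to penalize are illustrative assumptions; the paper's actual per-experiment strengths are listed in its Table 6.

```python
import torch
import torch.nn.functional as F

def train_step(model, batch, optimizer, alpha=1e-4):
    """One training step with the Hoyer-Square penalty added to the task loss.
    `alpha` is an illustrative strength, not a value reported by the paper."""
    x, y = batch
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    for param in model.parameters():
        if param.dim() > 1:  # penalize weight matrices / conv kernels, skip biases
            loss = loss + alpha * hoyer_square(param)  # from the sketch above
    loss.backward()
    optimizer.step()
    return loss.item()

# Optimizer as stated for the MNIST experiments: Adam with learning rate 0.001.
# optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```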