Learning sparse neural networks via sensitivity-driven regularization

Authors: Enzo Tartaglione, Skjalg Lepsøy, Attilio Fiandrotti, Gianluca Francini

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section we experiment with our regularization method in different supervised image classification tasks. Namely, we experiment training a number of well-known neural network architectures and over a number of different image datasets.
Researcher Affiliation | Collaboration | Enzo Tartaglione (Politecnico di Torino, Torino, Italy); Skjalg Lepsøy (Nuance Communications, Torino, Italy); Attilio Fiandrotti (Politecnico di Torino, Torino, Italy; Télécom ParisTech, Paris, France); Gianluca Francini (Telecom Italia, Torino, Italy)
Pseudocode | No | The paper describes its update rule and training procedure in natural language text but does not include structured pseudocode or an algorithm block (a rough sketch of such an update step is given after this table).
Open Source Code | No | The paper mentions implementation details ('Our method is implemented in Julia language and experiments are performed using the Knet package [22]') but does not provide a link or an explicit statement that its source code is open or available.
Open Datasets | Yes | To start with, we experiment training the fully connected LeNet300 and the convolutional LeNet5 over the standard MNIST dataset [23] (60k training images and 10k test images). Finally, we experiment on the far more complex VGG-16 [1] network over the larger ImageNet [25] dataset.
Dataset Splits | No | The paper mentions '60k training images and 10k test images' for MNIST, but does not explicitly provide details about a validation split for any dataset used.
Hardware Specification | No | The paper states 'Our method is implemented in Julia language and experiments are performed using the Knet package [22]' but provides no specific details about the hardware used for these experiments (e.g., CPU or GPU models).
Software Dependencies | No | The paper mentions 'Julia language', 'Knet package [22]', and 'keras pretrained model [1]' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | We use SGD with a learning parameter η = 0.1, a regularization factor λ = 10^-5 and a thresholding value T = 10^-3 unless otherwise specified. No other sparsity-promoting method (dropout, batch normalization) is used. For the sparsity step we have used SGD with η = 10^-3 and λ = 10^-5 for the specific sensitivity, λ = 10^-6 for the unspecific sensitivity.
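
The update rule and the hyperparameters quoted above are described only in prose, so the following is a minimal NumPy sketch of what one training step could look like under those settings. It assumes the update combines an ordinary SGD step on the loss with a decay term scaled, per weight, by an insensitivity factor derived from the output sensitivity, followed by magnitude thresholding at T. The function name, its arguments, and the way the sensitivity values are obtained are illustrative assumptions, not the authors' Julia/Knet implementation.

```python
import numpy as np

# Hyperparameters quoted in the Experiment Setup row above (assumed meanings).
ETA = 0.1     # SGD learning rate
LAM = 1e-5    # regularization factor lambda
T   = 1e-3    # magnitude threshold below which weights are zeroed

def sensitivity_driven_step(w, grad_loss, sensitivity):
    """Illustrative single update of a weight vector.

    w           -- current weights (1-D NumPy array)
    grad_loss   -- dL/dw for each weight
    sensitivity -- per-weight sensitivity of the network output
                   (assumed here to be pre-computed and non-negative)
    """
    # Insensitivity: close to 1 for weights the output barely depends on,
    # close to 0 for highly sensitive weights; clipped to stay non-negative.
    insensitivity = np.maximum(0.0, 1.0 - sensitivity)

    # SGD step on the loss plus a decay term that mainly shrinks
    # the weights the output is insensitive to.
    w = w - ETA * grad_loss - LAM * w * insensitivity

    # Thresholding step: prune weights whose magnitude fell below T.
    w[np.abs(w) < T] = 0.0
    return w

# Toy usage with random values, only to show the shapes involved.
rng = np.random.default_rng(0)
w_new = sensitivity_driven_step(rng.normal(size=8),
                                rng.normal(size=8),
                                rng.uniform(size=8))
```

In this sketch, sparsity arises from the combination of the insensitivity-scaled decay and the final thresholding at T; how closely this mirrors the paper's exact rule cannot be verified from the excerpts above.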