Learning sparse neural networks via sensitivity-driven regularization
Authors: Enzo Tartaglione, Skjalg Lepsøy, Attilio Fiandrotti, Gianluca Francini
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we experiment with our regularization method in different supervised image classification tasks. Namely, we experiment training a number of well-known neural network architectures over a number of different image datasets. |
| Researcher Affiliation | Collaboration | Enzo Tartaglione (Politecnico di Torino, Torino, Italy); Skjalg Lepsøy (Nuance Communications, Torino, Italy); Attilio Fiandrotti (Politecnico di Torino, Torino, Italy and Télécom ParisTech, Paris, France); Gianluca Francini (Telecom Italia, Torino, Italy) |
| Pseudocode | No | The paper describes its update rule and training procedure in natural language text but does not include structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper mentions implementation details ('Our method is implemented in Julia language and experiments are performed using the Knet package [22]') but does not provide a link or explicit statement that its source code is open or available. |
| Open Datasets | Yes | To start with, we experiment training the fully connected LeNet300 and the convolutional LeNet5 over the standard MNIST dataset [23] (60k training images and 10k test images). Finally, we experiment on the far more complex VGG-16 [1] network over the larger ImageNet [25] dataset. |
| Dataset Splits | No | The paper mentions '60k training images and 10k test images' for MNIST, but does not explicitly provide details about a validation split for any dataset used. |
| Hardware Specification | No | The paper states 'Our method is implemented in Julia language and experiments are performed using the Knet package [22]' but provides no specific details about the hardware used for these experiments (e.g., CPU, GPU models). |
| Software Dependencies | No | The paper mentions 'Julia language', 'Knet package [22]', and 'keras pretrained model [1]' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We use SGD with a learning parameter η = 0.1, a regularization factor λ = 10⁻⁵ and a thresholding value T = 10⁻³ unless otherwise specified. No other sparsity-promoting method (dropout, batch normalization) is used. For the sparsity step we have used SGD with η = 10⁻³ and λ = 10⁻⁵ for the specific sensitivity, λ = 10⁻⁶ for the unspecific sensitivity. |
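
Since the authors' Julia/Knet implementation is not released, the quoted setup cannot be run as-is. The sketch below is only an illustration of how the stated hyperparameters (η, λ, T) could be wired into a standard SGD loop in PyTorch, with magnitude thresholding applied at value T. The LeNet300-style layer sizes are an assumption, and the paper's sensitivity-driven regularization term is deliberately left as a placeholder rather than reconstructed.

```python
# Minimal sketch, NOT the authors' code: it only reflects the hyperparameters
# quoted in the Experiment Setup row. The sensitivity-driven penalty itself is
# not reproduced here (its update rule is described in the paper, not in this report).

import torch
import torch.nn as nn

ETA = 0.1        # SGD learning rate quoted for the main training phase
LAMBDA = 1e-5    # regularization factor quoted in the paper
THRESH = 1e-3    # thresholding value T: weights with |w| < T are zeroed

# Assumed LeNet300-style fully connected network for MNIST (layer sizes are an
# assumption based on the standard LeNet-300-100 architecture).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 300), nn.ReLU(),
    nn.Linear(300, 100), nn.ReLU(),
    nn.Linear(100, 10),
)

optimizer = torch.optim.SGD(model.parameters(), lr=ETA)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One plain SGD step on the classification loss."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    # Placeholder: the paper additionally shrinks low-sensitivity weights,
    # scaled by LAMBDA; that update rule is omitted in this sketch.
    return loss.item()

@torch.no_grad()
def apply_threshold(model, thresh=THRESH):
    """Set to zero every weight whose magnitude falls below T."""
    for p in model.parameters():
        p.mul_((p.abs() >= thresh).float())
```

The thresholding step would typically be invoked after training (or periodically during the sparsity phase) to obtain the reported sparsity levels; the choice of η = 10⁻³ and the λ values for the specific/unspecific sensitivity variants would replace ETA and LAMBDA in that phase.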