DEEP NEURAL NETWORK INITIALIZATION WITH SPARSITY INDUCING ACTIVATIONS
Authors: Ilan Price, Nicholas Daultry Ball, Adam Christopher Jones, Samuel Chun Hei Lam, Jared Tanner
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments verify the theory and show that the proposed magnitude clipped sparsifying activations can be trained with training and test fractional sparsity as high as 85% while retaining close to full accuracy. |
| Researcher Affiliation | Academia | Ilan Price*,†, Nicholas Daultry Ball*, Samuel C.H. Lam*, Adam C. Jones* & Jared Tanner*; (*) Mathematical Institute, University of Oxford; (†) The Alan Turing Institute; {ilan.price,nicholas.daultryball,samuel.lam,adam.c.jones,tanner}@maths.ox.ac.uk |
| Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We train both feedforward networks (abridged as DNNs) of width 300 and depth 100 using CReLUτ,m and CSTτ,m to classify digits from the MNIST dataset and, similarly, CNNs with 300 channels in each layer and depth 50 are trained to classify images from the CIFAR10 dataset. (The clipped activations CReLUτ,m and CSTτ,m are sketched below this table.) |
| Dataset Splits | Yes | For both MNIST and CIFAR10, 10% of the training set was held out as the validation set. (A hold-out sketch follows the table.) |
| Hardware Specification | Yes | Experiments were run on a single V100 GPU |
| Software Dependencies | No | The paper states 'implemented using Pytorch Lightning' but does not specify version numbers for PyTorch, Lightning, or any other critical software dependencies. |
| Experiment Setup | Yes | The networks are initialized at the EoC using q = 1, before being trained by stochastic gradient descent (SGD) for 200 epochs with learning rates of 10^-4 and 10^-3 for the DNN and CNN respectively. (EoC initialization and training-loop sketches follow the table.) |
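The clipped sparsifying activations named in the table, CReLUτ,m and CSTτ,m, are not defined in this summary, so the following PyTorch sketch assumes the natural reading of the names: CReLUτ,m(x) = min(max(x − τ, 0), m), a shifted ReLU clipped at magnitude m, and CSTτ,m(x) = sign(x) · min(max(|x| − τ, 0), m), a magnitude-clipped soft thresholding. Treat these closed forms and the parameter values as illustrative assumptions rather than the paper's exact definitions.

```python
import torch

def crelu(x: torch.Tensor, tau: float, m: float) -> torch.Tensor:
    # Assumed CReLU_{tau,m}: shift by tau, zero the rest, clip the magnitude at m.
    return torch.clamp(torch.relu(x - tau), max=m)

def cst(x: torch.Tensor, tau: float, m: float) -> torch.Tensor:
    # Assumed CST_{tau,m}: soft thresholding at tau, clipped at magnitude m.
    return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0, max=m)

# Both zero out a tunable fraction of Gaussian pre-activations (the "sparsity").
z = torch.randn(100_000)
print("CReLU sparsity:", (crelu(z, tau=0.5, m=1.0) == 0).float().mean().item())
print("CST sparsity:  ", (cst(z, tau=0.5, m=1.0) == 0).float().mean().item())
```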
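The Experiment Setup row says the networks are initialized at the EoC (edge of chaos) with q = 1, but gives no recipe. Under the standard mean-field conditions used in the EoC literature (a fixed point q* of the pre-activation variance map and a unit mean squared derivative; see Poole et al., 2016 and Schoenholz et al., 2017), the weight and bias variances can be estimated numerically. The sketch below does this by Monte Carlo for the assumed CReLU above; it is an assumption about what "initialized at the EoC using q = 1" entails, not a procedure quoted from the paper.

```python
import torch

def eoc_variances(phi, q_star: float = 1.0, n: int = 1_000_000):
    """Monte-Carlo estimate of (sigma_w^2, sigma_b^2) satisfying the standard
    mean-field edge-of-chaos conditions at the fixed point q*:
        sigma_w^2 * E[phi'(sqrt(q*) Z)^2] = 1
        q*        = sigma_b^2 + sigma_w^2 * E[phi(sqrt(q*) Z)^2],  Z ~ N(0, 1).
    phi' is obtained via autograd, so phi must be differentiable almost everywhere."""
    z = (q_star ** 0.5) * torch.randn(n)
    z.requires_grad_(True)
    out = phi(z)
    (dphi,) = torch.autograd.grad(out.sum(), z)
    sigma_w2 = 1.0 / (dphi ** 2).mean().item()
    sigma_b2 = q_star - sigma_w2 * (out ** 2).mean().item()
    return sigma_w2, sigma_b2

# Example with the assumed CReLU_{0.5, 1.0} from the previous sketch.
sw2, sb2 = eoc_variances(lambda x: torch.clamp(torch.relu(x - 0.5), max=1.0))
print(f"sigma_w^2 ~ {sw2:.3f}, sigma_b^2 ~ {sb2:.3f}")
```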
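The 10% validation hold-out reported in the Dataset Splits row could be reproduced along the following lines with torchvision; the transform, seed, and use of random_split are illustrative assumptions, since the summary does not specify how the split was made.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# MNIST with a 90/10 train/validation hold-out; CIFAR10 would follow the same pattern.
full_train = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
n_val = int(0.1 * len(full_train))
train_set, val_set = random_split(
    full_train, [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0),  # seed is an assumed choice
)
test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())
```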
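Finally, the Experiment Setup and Software Dependencies rows describe plain SGD for 200 epochs, with learning rates 10^-4 (DNN) and 10^-3 (CNN), implemented with PyTorch Lightning on a single V100. A minimal LightningModule sketch with those hyperparameters follows; the network body (the width-300, depth-100 MLP or the 300-channel, depth-50 CNN) is passed in as a placeholder, since its construction is not detailed in this summary.

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl

class LitClassifier(pl.LightningModule):
    """Training loop matching the hyperparameters quoted in the table."""

    def __init__(self, model: torch.nn.Module, lr: float = 1e-4):  # 1e-3 for the CNN
        super().__init__()
        self.model = model
        self.lr = lr

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.model(x), y)
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        acc = (self.model(x).argmax(dim=1) == y).float().mean()
        self.log("val_acc", acc)

    def configure_optimizers(self):
        # Plain SGD, as stated in the Experiment Setup row.
        return torch.optim.SGD(self.parameters(), lr=self.lr)

# trainer = pl.Trainer(max_epochs=200, accelerator="gpu", devices=1)  # single V100
# trainer.fit(LitClassifier(dnn_model), train_loader, val_loader)     # dnn_model is a placeholder
```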