Most Activation Functions Can Win the Lottery Without Excessive Depth
Authors: Rebekka Burkholz
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3 (Experiments): To demonstrate that our theoretical results make realistic claims, we present three sets of experiments that highlight different advantages of the (L+1)-construction and the 2L-construction. In all cases, we emulate our constructive existence proofs by pruning source networks to approximate a given target network. All experiments were conducted on a machine with Intel(R) Core(TM) i9-10850K CPU @ 3.60GHz processor and GPU NVIDIA GeForce RTX 3080 Ti. Table 1 caption: LT pruning results on MNIST. Averages and 0.95 standard confidence intervals are reported for 5 independent source network initializations. Parameters are counted in packs of 1000. |
| Researcher Affiliation | Academia | Rebekka Burkholz, CISPA Helmholtz Center for Information Security, 66123 Saarbrücken, Germany, burkholz@cispa.de |
| Pseudocode | No | The paper contains detailed proof outlines (e.g., "Proof Outline" for Theorem 2.5 and 2.6) which describe steps, but these are not formatted as pseudocode or an algorithm block. |
| Open Source Code | Yes | Code is available on GitHub (RelationalML/LT-existence). |
| Open Datasets | Yes | As in the influential work [13], we use Iterative Magnitude Pruning (IMP) on LeNet networks with architecture [784, 300, 100, 10] to find LTs that achieve a good performance on the MNIST classification task [7]. |
| Dataset Splits | No | The paper mentions training on the 'MNIST classification task' and on 'tiny-ImageNet training data' and evaluating on 'tiny-ImageNet test data', but does not specify a validation split or split percentages (e.g., 80/10/10, or specific counts for validation). |
| Hardware Specification | Yes | All experiments were conducted on a machine with Intel(R) Core(TM) i9-10850K CPU @ 3.60GHz processor and GPU NVIDIA GeForce RTX 3080 Ti. |
| Software Dependencies | No | Using the PyTorch implementation of the GitHub repository open_lth with MIT license, we arrive at a target network for each of four considered activation functions after 12 pruning steps: ReLU, LReLU, Sigmoid, and Tanh. |
| Experiment Setup | Yes | Using the PyTorch implementation of the GitHub repository open_lth with MIT license, we arrive at a target network for each of four considered activation functions after 12 pruning steps: ReLU, LReLU, Sigmoid, and Tanh. Their performance and number of nonzero parameters are reported in Table 1 in the target column alongside our results for the (L+1)-construction and our 2L-construction, which achieve a similar performance. A minimal illustrative sketch of this IMP setup is given after the table. |
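
The Experiment Setup row quotes the paper's use of IMP on a LeNet-style [784, 300, 100, 10] network for MNIST with four activation functions (ReLU, LReLU, Sigmoid, Tanh) and 12 pruning rounds via open_lth. The sketch below is a minimal, hedged illustration of such an IMP loop, not the paper's code: the pruning rate, learning rate, epochs per round, and the choice to rewind surviving weights to their initial values are assumptions made for illustration, and open_lth configures these differently.

```python
# Minimal sketch (not the paper's code) of IMP on a [784, 300, 100, 10] MLP
# for MNIST with a configurable activation. Hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

ACTIVATIONS = {"relu": nn.ReLU, "lrelu": nn.LeakyReLU,
               "sigmoid": nn.Sigmoid, "tanh": nn.Tanh}

def make_lenet(act: str) -> nn.Sequential:
    """[784, 300, 100, 10] fully connected network with the chosen activation."""
    a = ACTIVATIONS[act]
    return nn.Sequential(nn.Flatten(),
                         nn.Linear(784, 300), a(),
                         nn.Linear(300, 100), a(),
                         nn.Linear(100, 10))

def train_one_round(model, loader, epochs=1, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

def nonzero_params(layers):
    """Count surviving (nonzero) weights, as reported in the paper's Table 1."""
    return sum(int((m.weight != 0).sum()) for m in layers)

def imp(act="relu", rounds=12, prune_rate=0.2):
    train_set = datasets.MNIST("data", train=True, download=True,
                               transform=transforms.ToTensor())
    loader = DataLoader(train_set, batch_size=128, shuffle=True)

    model = make_lenet(act)
    layers = [m for m in model.modules() if isinstance(m, nn.Linear)]
    init = [(m.weight.detach().clone(), m.bias.detach().clone()) for m in layers]

    for r in range(rounds):
        train_one_round(model, loader)
        # Prune globally by weight magnitude up to the cumulative sparsity
        # 1 - (1 - prune_rate)^(r+1); previously pruned (zero) weights are
        # re-selected, so each round removes ~20% of the *remaining* weights.
        target_sparsity = 1.0 - (1.0 - prune_rate) ** (r + 1)
        prune.global_unstructured([(m, "weight") for m in layers],
                                  pruning_method=prune.L1Unstructured,
                                  amount=target_sparsity)
        # Lottery-ticket rewinding: reset surviving weights to initialization.
        # (`weight_orig` is the reparametrized tensor created by torch pruning.)
        with torch.no_grad():
            for m, (w0, b0) in zip(layers, init):
                m.weight_orig.copy_(w0)
                m.bias.copy_(b0)
        print(f"{act}, round {r + 1}: {nonzero_params(layers)} nonzero weights")
    return model

if __name__ == "__main__":
    for name in ACTIVATIONS:
        imp(act=name)
```

Passing a cumulative `amount` to `prune.global_unstructured` keeps earlier masks intact while pruning a fixed fraction of the still-surviving weights each round, which is the iterative schedule IMP relies on; the per-round fraction of 20% is an assumption, not a value taken from the paper.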