Pruning Randomly Initialized Neural Networks with Iterative Randomization
Authors: Daiki Chijiwa, Shin'ya Yamaguchi, Yasutoshi Ida, Kenji Umakoshi, Tomohiro Inoue
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also empirically demonstrate the parameter efficiency in multiple experiments on CIFAR-10 and ImageNet. The code is available at https://github.com/dchiji-ntt/iterand. |
| Researcher Affiliation | Industry | NTT Computer and Data Science Laboratories, NTT Corporation; NTT Social Informatics Laboratories, NTT Corporation |
| Pseudocode | Yes | Algorithm 1: Weight-pruning optimization by edge-popup [23] ... Algorithm 2: Pseudo code of TrainMask ... Algorithm 3: Weight-pruning optimization with iterative randomization (IteRand) |
| Open Source Code | Yes | The code is available at https://github.com/dchiji-ntt/iterand. |
| Open Datasets | Yes | We used two vision datasets: CIFAR-10 [12] and Image Net [25]. |
| Dataset Splits | Yes | We randomly split the 50k training images into 45k for actual training and 5k for validation. ... We randomly split the training images into 99 : 1, and used the former for actual training and the latter for validating models. |
| Hardware Specification | Yes | All of our experiments were performed with 1 GPU (NVIDIA GTX 1080 Ti, 11GB) for CIFAR-10 and 2 GPUs (NVIDIA V100, 16GB) for ImageNet. |
| Software Dependencies | No | The paper does not list the key software components and version numbers used in its implementation. It references the 'torch.nn.init — PyTorch 1.8.1 documentation', but only as a citation, not as a stated software dependency. |
| Experiment Setup | Yes | In the experiments for IteRand and edge-popup, we used the sparsity rate of p = 0.5 for Conv6 and p = 0.6 for ResNet18 and ResNet34. ... We fix K_per = 300 on CIFAR-10 (about 1 epoch) and K_per = 1000 on ImageNet (about 1/10 epochs) in our experiments (Section 5). ... r = 0.1 works well with various network architectures on CIFAR-10 ... The details of the network architectures and hyperparameters for training are given in Appendix B. |
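
The pseudocode and experiment-setup rows above outline the IteRand procedure: train edge-popup pruning scores over frozen random weights, and every K_per optimization steps re-randomize some of the currently pruned weights. The PyTorch code below is a minimal sketch of that idea, not the authors' implementation (see https://github.com/dchiji-ntt/iterand for the official code): the names `GetSubnet`, `SubnetLinear`, `rerandomize_pruned`, and `train` are illustrative stand-ins, the layer is a linear layer rather than the paper's convolutional architectures, and the randomization operation is simplified to partial Kaiming-normal resampling with ratio `rate`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GetSubnet(torch.autograd.Function):
    """Straight-through top-k mask in the style of edge-popup: the forward pass keeps
    the highest-scoring weights, the backward pass sends gradients straight to the scores."""

    @staticmethod
    def forward(ctx, scores, keep_ratio):
        out = scores.clone()
        _, idx = scores.flatten().sort()
        num_pruned = int((1.0 - keep_ratio) * scores.numel())
        flat = out.flatten()                    # view of `out`, so the writes below modify it
        flat[idx[:num_pruned]] = 0.0            # prune the lowest-scoring weights
        flat[idx[num_pruned:]] = 1.0            # keep the rest
        return out

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None


class SubnetLinear(nn.Linear):
    """Linear layer whose weights stay frozen at random init; only the scores are trained."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__(in_features, out_features, bias=False)
        self.weight.requires_grad_(False)       # weights are never updated by SGD
        self.scores = nn.Parameter(torch.rand_like(self.weight) * 0.01)
        self.sparsity = sparsity                # fraction of weights pruned (p in the paper)

    def current_mask(self):
        return GetSubnet.apply(self.scores.abs(), 1.0 - self.sparsity)

    def forward(self, x):
        return F.linear(x, self.weight * self.current_mask(), None)

    @torch.no_grad()
    def rerandomize_pruned(self, rate=0.1):
        # IteRand-style step: resample fresh random values and overwrite a fraction
        # `rate` of the currently pruned weights; the selected subnetwork is untouched.
        fresh = torch.empty_like(self.weight)
        nn.init.kaiming_normal_(fresh, mode="fan_in", nonlinearity="relu")
        pruned = self.current_mask() == 0.0
        resample = pruned & (torch.rand_like(self.weight) < rate)
        self.weight[resample] = fresh[resample]


def train(model, loader, steps, k_per=300, rate=0.1, lr=0.1):
    """Optimize only the scores; every k_per steps, re-randomize pruned weights."""
    opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                          lr=lr, momentum=0.9, weight_decay=1e-4)
    step = 0
    while step < steps:
        for x, y in loader:
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            step += 1
            if step % k_per == 0:
                for m in model.modules():
                    if isinstance(m, SubnetLinear):
                        m.rerandomize_pruned(rate)
            if step >= steps:
                return
```

The hyperparameters quoted in the table map directly onto this sketch: `sparsity` corresponds to p (0.5 for Conv6, 0.6 for ResNet18/34), `k_per` to K_per (300 on CIFAR-10, 1000 on ImageNet), and `rate` to r = 0.1.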
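
The dataset-splits row quotes a 45k/5k train/validation split of CIFAR-10's 50k training images. A minimal sketch of such a split with torchvision is shown below; the data path, transform, and seed are assumptions for illustration, not details from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Hypothetical reproduction of the quoted 45k/5k split of CIFAR-10's 50k training images.
transform = transforms.ToTensor()                      # placeholder transform
full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
generator = torch.Generator().manual_seed(0)           # seed is an assumption, not from the paper
train_set, val_set = random_split(full_train, [45_000, 5_000], generator=generator)
```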