Pruning Randomly Initialized Neural Networks with Iterative Randomization
Authors: Daiki Chijiwa, Shin'ya Yamaguchi, Yasutoshi Ida, Kenji Umakoshi, Tomohiro Inoue
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also empirically demonstrate the parameter efficiency in multiple experiments on CIFAR-10 and ImageNet. The code is available at https://github.com/dchiji-ntt/iterand. |
| Researcher Affiliation | Industry | NTT Computer and Data Science Laboratories, NTT Corporation; NTT Social Informatics Laboratories, NTT Corporation |
| Pseudocode | Yes | Algorithm 1: Weight-pruning optimization by edge-popup [23] ... Algorithm 2: Pseudo code of TrainMask ... Algorithm 3: Weight-pruning optimization with iterative randomization (IteRand) |
| Open Source Code | Yes | The code is available at https://github.com/dchiji-ntt/iterand. |
| Open Datasets | Yes | We used two vision datasets: CIFAR-10 [12] and Image Net [25]. |
| Dataset Splits | Yes | We randomly split the 50k training images into 45k for actual training and 5k for validation. ... We randomly split the training images into 99 : 1, and used the former for actual training and the latter for validating models. |
| Hardware Specification | Yes | All of our experiments were performed with 1 GPU (NVIDIA GTX 1080 Ti, 11GB) for CIFAR-10 and 2 GPUs (NVIDIA V100, 16GB) for ImageNet. |
| Software Dependencies | No | The paper does not list the key software components and version numbers used in its implementation. It references the 'torch.nn.init — PyTorch 1.8.1 documentation', but only as a citation, not as a stated software dependency. |
| Experiment Setup | Yes | In the experiments for IteRand and edge-popup, we used the sparsity rate of p = 0.5 for Conv6 and p = 0.6 for ResNet18 and ResNet34. ... We fix K_per = 300 on CIFAR-10 (about 1 epoch) and K_per = 1000 on ImageNet (about 1/10 epochs) in our experiments (Section 5). ... r = 0.1 works well with various network architectures on CIFAR-10 ... The details of the network architectures and hyperparameters for training are given in Appendix B. |
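
The pseudocode and experiment-setup rows above outline the IteRand procedure: train edge-popup pruning scores over frozen random weights, and every K_per optimization steps re-randomize some of the currently pruned weights. The PyTorch code below is a minimal sketch of that idea, not the authors' implementation (see https://github.com/dchiji-ntt/iterand for the official code): the names `GetSubnet`, `SubnetLinear`, `rerandomize_pruned`, and `train` are illustrative stand-ins, the layer is a linear layer rather than the paper's convolutional architectures, and the randomization operation is simplified to partial Kaiming-normal resampling with ratio `rate`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GetSubnet(torch.autograd.Function):
    """Straight-through top-k mask in the style of edge-popup: the forward pass keeps
    the highest-scoring weights, the backward pass sends gradients straight to the scores."""

    @staticmethod
    def forward(ctx, scores, keep_ratio):
        out = scores.clone()
        _, idx = scores.flatten().sort()
        num_pruned = int((1.0 - keep_ratio) * scores.numel())
        flat = out.flatten()                    # view of `out`, so the writes below modify it
        flat[idx[:num_pruned]] = 0.0            # prune the lowest-scoring weights
        flat[idx[num_pruned:]] = 1.0            # keep the rest
        return out

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None


class SubnetLinear(nn.Linear):
    """Linear layer whose weights stay frozen at random init; only the scores are trained."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__(in_features, out_features, bias=False)
        self.weight.requires_grad_(False)       # weights are never updated by SGD
        self.scores = nn.Parameter(torch.rand_like(self.weight) * 0.01)
        self.sparsity = sparsity                # fraction of weights pruned (p in the paper)

    def current_mask(self):
        return GetSubnet.apply(self.scores.abs(), 1.0 - self.sparsity)

    def forward(self, x):
        return F.linear(x, self.weight * self.current_mask(), None)

    @torch.no_grad()
    def rerandomize_pruned(self, rate=0.1):
        # IteRand-style step: resample fresh random values and overwrite a fraction
        # `rate` of the currently pruned weights; the selected subnetwork is untouched.
        fresh = torch.empty_like(self.weight)
        nn.init.kaiming_normal_(fresh, mode="fan_in", nonlinearity="relu")
        pruned = self.current_mask() == 0.0
        resample = pruned & (torch.rand_like(self.weight) < rate)
        self.weight[resample] = fresh[resample]


def train(model, loader, steps, k_per=300, rate=0.1, lr=0.1):
    """Optimize only the scores; every k_per steps, re-randomize pruned weights."""
    opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                          lr=lr, momentum=0.9, weight_decay=1e-4)
    step = 0
    while step < steps:
        for x, y in loader:
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            step += 1
            if step % k_per == 0:
                for m in model.modules():
                    if isinstance(m, SubnetLinear):
                        m.rerandomize_pruned(rate)
            if step >= steps:
                return
```

The hyperparameters quoted in the table map directly onto this sketch: `sparsity` corresponds to p (0.5 for Conv6, 0.6 for ResNet18/34), `k_per` to K_per (300 on CIFAR-10, 1000 on ImageNet), and `rate` to r = 0.1.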
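
The dataset-splits row quotes a 45k/5k train/validation split of CIFAR-10's 50k training images. A minimal sketch of such a split with torchvision is shown below; the data path, transform, and seed are assumptions for illustration, not details from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Hypothetical reproduction of the quoted 45k/5k split of CIFAR-10's 50k training images.
transform = transforms.ToTensor()                      # placeholder transform
full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
generator = torch.Generator().manual_seed(0)           # seed is an assumption, not from the paper
train_set, val_set = random_split(full_train, [45_000, 5_000], generator=generator)
```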