Plant 'n' Seek: Can You Find the Winning Ticket?
Authors: Jonas Fischer, Rebekka Burkholz
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To answer these questions systematically, we derive a framework to plant and hide target architectures within large randomly initialized neural networks. For three common challenges in machine learning, we hand-craft extremely sparse network topologies, plant them in large neural networks, and evaluate state-of-the-art lottery ticket pruning methods. (A hedged sketch of such planting appears below the table.) |
| Researcher Affiliation | Academia | Jonas Fischer, Max Planck Institute for Informatics (fischer@mpi-inf.mpg.de); Rebekka Burkholz, CISPA Helmholtz Center for Information Security (burkholz@cispa.de) |
| Pseudocode | Yes | We provide pseudocode and details in App. A.3. (Referring to Algorithm 1 and Algorithm 2 in Appendix A.3) |
| Open Source Code | Yes | Our code is publicly available at www.github.com/RelationalML/PlantNSeek. |
| Open Datasets | Yes | For that, we use SYNFLOW to discover a weak ticket of sparsity 0.01 from VGG16 with multishot pruning, train the weak ticket on CIFAR10, and plant it back into the network. (A minimal SYNFLOW scoring sketch appears below the table.) |
| Dataset Splits | No | To assess the accuracy respectively mean squared error of the tickets and trained models, we split off 10% of the data that acts as a hold out test set. (The paper mentions using "validation sets" for convergence but does not specify their size or how they are split from the main data.) |
| Hardware Specification | No | The paper does not specify any particular hardware (CPU, GPU, RAM, etc.) used for running the experiments. |
| Software Dependencies | No | The paper cites the Adam optimizer (Kingma & Ba, 2015) but does not specify versions for software libraries or frameworks such as PyTorch, TensorFlow, or Python itself. |
| Experiment Setup | Yes | To prune by GRASP, SNIP, SYNFLOW, MAGNITUDE, and RANDOM and train the derived tickets, we use Adam Kingma & Ba (2015) with a learning rate of 0.001. Training of the discovered tickets was done for 10 epochs across all experiments, where we could always observe a convergence of the respective score on the validation sets (accuracy or MSE). We measured loss by MSE respectively cross entropy loss and used a batch size of 32 for all experiments. (A training-loop sketch using these settings appears below the table.) |
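The planting step itself is easy to illustrate. Below is a minimal single-layer sketch, not the authors' Algorithm 1 or 2 from App. A.3: the ticket's nonzero weights are copied to randomly chosen neuron positions of a larger randomly initialized layer, and a ground-truth mask records the ticket's support. The function name `plant_layer` and all shapes are illustrative assumptions.

```python
import torch

def plant_layer(large_w: torch.Tensor, ticket_w: torch.Tensor, seed: int = 0):
    """Hide a small hand-crafted layer `ticket_w` (t_out x t_in) inside a
    large randomly initialized layer `large_w` (n_out x n_in).
    Returns the planted weight matrix and the ground-truth ticket mask."""
    g = torch.Generator().manual_seed(seed)
    t_out, t_in = ticket_w.shape
    # choose random neuron positions at which to hide the ticket
    rows = torch.randperm(large_w.shape[0], generator=g)[:t_out].unsqueeze(1)
    cols = torch.randperm(large_w.shape[1], generator=g)[:t_in]
    planted, mask = large_w.clone(), torch.zeros_like(large_w)
    sub = planted[rows, cols]            # the selected random block (a copy)
    support = ticket_w != 0
    sub[support] = ticket_w[support]     # plant only the ticket's nonzero weights
    planted[rows, cols] = sub            # write the block back into the layer
    mask[rows, cols] = support.float()   # ground truth: where the ticket hides
    return planted, mask

# e.g. hide a 4x4 ticket in a 100x100 randomly initialized layer
planted, mask = plant_layer(torch.randn(100, 100), torch.randn(4, 4))
```

In the paper the planting spans whole multi-layer architectures rather than a single matrix, but the principle is the same: a pruning method succeeds on the task if the mask it recovers matches the planted support.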
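For the pruning side, the SYNFLOW saliency quoted above can be sketched in a few lines. This is a single scoring round in the style of Tanaka et al. (2020), not the paper's multishot schedule; the function name and input-shape handling are assumptions.

```python
import torch

def synflow_scores(model: torch.nn.Module, input_shape: tuple):
    """One round of SynFlow saliency |theta * dR/dtheta|, where R is the output
    of the linearized (all-absolute-weights) network on an all-ones input."""
    # linearize: replace every parameter by its absolute value, remembering signs
    signs = {}
    with torch.no_grad():
        for name, p in model.state_dict().items():
            signs[name] = torch.sign(p)
            p.abs_()
    model.zero_grad()
    R = model(torch.ones(1, *input_shape)).sum()  # synaptic flow objective
    R.backward()
    scores = {n: (p.grad * p.detach()).abs().clone()
              for n, p in model.named_parameters() if p.grad is not None}
    # undo the linearization to restore the original weights
    with torch.no_grad():
        for name, p in model.state_dict().items():
            p.mul_(signs[name])
    model.zero_grad()
    return scores
```

In the multishot setting the paper evaluates, such scores are recomputed over several rounds while an increasing fraction of the lowest-scoring weights is pruned.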
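Finally, the reported experiment setup translates into a standard training loop. The sketch below uses the quoted hyperparameters (Adam with learning rate 0.001, batch size 32, 10 epochs, a 10% hold-out split) but substitutes a stand-in linear model for a discovered ticket, since the paper fixes neither library versions nor the exact split procedure.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

LR, BATCH_SIZE, EPOCHS = 1e-3, 32, 10           # values quoted in the paper

data = datasets.CIFAR10("./data", train=True, download=True,
                        transform=transforms.ToTensor())
n_test = len(data) // 10                         # "split off 10% ... hold out test set"
train_set, test_set = random_split(data, [len(data) - n_test, n_test])
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in for a ticket
opt = torch.optim.Adam(model.parameters(), lr=LR)
loss_fn = nn.CrossEntropyLoss()                  # the paper uses MSE for regression tasks

for epoch in range(EPOCHS):
    for x, y in train_loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```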