Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability

Authors: Robert Tjarko Lange, Henning Sprekeler

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We apply iterative magnitude pruning (Han et al., 2015, IMP; figure 1 left) to the ES setting and establish the existence of highly sparse evolvable initializations. They consistently exist across different ES, architectures (multi-layer perceptrons/MLP & convolutional neural networks/CNN) and tasks (9 control & 3 vision tasks)."
Researcher Affiliation | Academia | "1 Technical University Berlin, Berlin, Germany; 2 Science of Intelligence Cluster of Excellence. Correspondence to: Robert Tjarko Lange <robert.t.lange@tu-berlin.de>."
Pseudocode | Yes | "The procedure is summarized in Algorithm 1. We note that (Blundell et al., 2015) previously considered a SNR criterion in the context of zero-shot pruning of Bayesian neural networks." Algorithm 1: SNR-Based Iterative Pruning for ES. (A minimal sketch of one pruning round follows the table.)
Open Source Code | Yes | "The code is publicly available under https://github.com/RobertTLange/es-lottery."
Open Datasets | Yes | "We focus on 12 tasks... The environments are implemented by the Brax (Freeman et al., 2021) package... Next, we evolve CNN architectures on the standard MNIST, Fashion-MNIST (F-MNIST) and Kuzushiji-MNIST (K-MNIST) digit classification tasks."
Dataset Splits | No | The paper mentions evaluating on "test episodes" and reporting "test accuracy", and lists a "Train Eval" entry in its hyperparameter tables, but it does not specify a distinct validation split or how one was constructed.
Hardware Specification | Yes | "Simulations were conducted on a high-performance cluster... Brax tasks & MLP policy: 1 NVIDIA V100S GPU, ca. 10 hours. Pendulum task & MLP network: 1 NVIDIA RTX 2080Ti GPU, ca. 1 hour. MNIST/F-MNIST/K-MNIST task & CNN network: 1 NVIDIA RTX 2080Ti GPU, ca. 2 hours."
Software Dependencies | No | The paper lists software such as JAX, Matplotlib, Seaborn, and NumPy with citations that indicate publication years, but it does not pin specific version numbers (e.g., Python 3.8, PyTorch 1.9) needed for exact reproduction.
Experiment Setup | Yes | "For each task-network-ES combination we run the ES-adapted pruning procedure and prune at each iteration 20 percent of the remaining non-pruned weights (p = 0.2). ... We refer the interested reader to the supplementary information (SI) for task-specific hyperparameters." Appendix C, Hyperparameter Settings (Tables 2-5), gives detailed settings such as population size, generations, batch size, learning rate, and network type. (The sparsity schedule implied by p = 0.2 is sketched after the table.)
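The "Pseudocode" row above points to Algorithm 1, an SNR-based iterative pruning loop for ES. The paper's exact pseudocode is not reproduced here; the following is a minimal NumPy sketch of a single pruning round, assuming (as the Blundell et al., 2015 reference suggests) that the signal-to-noise ratio is taken per weight as |mu| / sigma of the ES search distribution, and that the lowest-SNR fraction p of the still-active weights is removed each round. Names such as snr_prune_step, mu, sigma, and mask are illustrative, not taken from the paper.

    import numpy as np

    def snr_prune_step(mu, sigma, mask, p=0.2):
        """One pruning round (illustrative sketch): deactivate the fraction p of
        still-active weights with the lowest signal-to-noise ratio |mu| / sigma.

        mu, sigma, mask are flat arrays over all evolvable parameters;
        mask is True for weights that have not been pruned yet.
        """
        snr = np.abs(mu) / (sigma + 1e-8)         # per-weight SNR of the search distribution
        active = np.flatnonzero(mask)             # indices of weights not yet pruned
        n_prune = int(np.floor(p * active.size))  # prune 20% of the *remaining* weights
        if n_prune == 0:
            return mask
        prune_idx = active[np.argsort(snr[active])[:n_prune]]
        new_mask = mask.copy()
        new_mask[prune_idx] = False               # pruned weights stay fixed at zero
        return new_mask

In the full iterative procedure this step would alternate with re-running the ES on the masked network, mirroring how iterative magnitude pruning alternates pruning and retraining in the gradient-based lottery-ticket setting.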
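The "Experiment Setup" row fixes only the per-round pruning ratio p = 0.2. Assuming nothing beyond that quoted rule (and ignoring integer rounding of the per-round prune count), the fraction of weights remaining after k pruning rounds is (1 - p)^k, which gives the following schedule:

    p = 0.2                                   # prune 20% of the remaining weights per round
    for k in [1, 5, 10, 15, 20]:
        remaining = (1 - p) ** k              # fraction of weights still active after k rounds
        print(f"round {k:2d}: {remaining:7.2%} remaining, {1 - remaining:7.2%} sparsity")

After 20 rounds about 1.15% of the original weights remain, i.e. roughly 98.8% sparsity.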