Learn2Hop: Learned Optimization on Rough Landscapes

Authors: Amil Merchant, Luke Metz, Samuel S. Schoenholz, Ekin D. Cubuk

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we show that learned optimizers can offer a substantial improvement over classical algorithms for these sorts of global optimization problems. Using several canonical problems in atomic structural optimization, we demonstrate that the learned optimizers outperform their classical counterparts when trained and tested on similar systems and more surprisingly are able to generalize to unseen systems. Table 2. Learned optimizers show improvement across all tested potential types.
Researcher Affiliation | Industry | Google Research, Mountain View, California, USA. This work was done as part of the Google AI Residency Program (https://research.google/careers/ai-residency/). Correspondence to: Amil Merchant <amilmerchant@google.com>, Ekin D. Cubuk <cubuk@google.com>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | Yes | Code is available at https://learn2hop.page.link/github.
Open Datasets | No | The paper refers to well-known empirical potentials (Lennard-Jones, Gupta, Stillinger-Weber) for atomic systems and mentions that JAX-MD was used to implement them, but it does not provide concrete access information (specific links, DOIs, or repositories) for pre-collected datasets or data files.
Dataset Splits | No | The paper describes using random initializations for its experiments but does not specify explicit train/validation/test dataset splits with percentages, sample counts, or predefined partition files.
Hardware Specification | Yes | The associated training and evaluation utilized V100 GPUs. For distributed training, the controller batches computation on up to 8 GPUs. (A hedged sketch of cross-device gradient averaging follows the table.)
Software Dependencies | No | The paper mentions that 'The aforementioned potentials are coded using JAX-MD (Schoenholz & Cubuk, 2019). The learned optimizers are built in JAX (Bradbury et al., 2018)', but it does not provide specific version numbers for these software components. (A hedged JAX-MD usage sketch follows the table.)
Experiment Setup | Yes | For each inner-loop, we apply 50000 optimization steps before computing meta-gradients. Batched training occurs with 80 random initializations of atomic structure problems. Once meta-gradients are averaged, a central controller meta-updates the parameters of the learned optimizer via Adam with a learning rate of 10^-2 (which decays exponentially by 0.98 every 10 steps). This repeats for a total of 1000 meta-updates. (A hedged sketch of this meta-update loop follows the table.)
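
The Open Datasets and Software Dependencies rows note that the empirical potentials (Lennard-Jones, Gupta, Stillinger-Weber) are implemented with JAX-MD rather than distributed as data files. Below is a minimal sketch of what that looks like in practice, assuming JAX-MD's standard space and energy modules; the cluster size and sigma/epsilon values are placeholders, and none of this is the authors' released code.

```python
import jax
import jax.numpy as jnp
from jax_md import space, energy

key = jax.random.PRNGKey(0)
n_atoms = 13                                  # placeholder: a small cluster
positions = jax.random.uniform(key, (n_atoms, 3), minval=0.0, maxval=3.0)

# Free (non-periodic) space is the natural choice for isolated clusters.
displacement_fn, shift_fn = space.free()

# Pairwise Lennard-Jones energy; sigma/epsilon here are illustrative defaults.
energy_fn = energy.lennard_jones_pair(displacement_fn, sigma=1.0, epsilon=1.0)

potential = energy_fn(positions)              # scalar potential energy
forces = -jax.grad(energy_fn)(positions)      # forces = -dE/dR, fed to an optimizer
```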
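The Hardware Specification row reports that meta-gradient computation is batched on up to 8 GPUs. A standard JAX pattern for that kind of data parallelism, sketched below with a placeholder loss (the paper does not state that the authors used jax.pmap specifically), is to shard the 80 problems across devices and average per-device meta-gradients with jax.lax.pmean:

```python
import functools
import jax
import jax.numpy as jnp

def local_meta_grad(meta_params, local_problems):
    # Placeholder for the gradient of the unrolled inner-loop loss on the
    # problems assigned to this device.
    dummy_loss = lambda p: jnp.sum(p ** 2) + 0.0 * jnp.sum(local_problems)
    return jax.grad(dummy_loss)(meta_params)

@functools.partial(jax.pmap, axis_name="devices")
def synced_meta_grad(meta_params, local_problems):
    grads = local_meta_grad(meta_params, local_problems)
    return jax.lax.pmean(grads, axis_name="devices")   # average across GPUs

n_dev = jax.local_device_count()
meta_params = jnp.zeros((n_dev, 256))                  # replicated optimizer weights
problems = jnp.zeros((n_dev, 80 // n_dev, 3))          # 80 problems sharded over devices
grads = synced_meta_grad(meta_params, problems)        # identical copy on every device
```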
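The Experiment Setup row maps onto a meta-training loop along the lines below. This is a sketch, not the released implementation: it assumes optax for the outer Adam optimizer (the paper only says the optimizers are built in JAX) and replaces the unrolled 50000-step inner loop over 80 atomic-structure initializations with a placeholder meta_loss. The reported hyperparameters (learning rate 10^-2, exponential decay of 0.98 every 10 steps, 1000 meta-updates) are kept as stated.

```python
import jax
import jax.numpy as jnp
import optax

def meta_loss(meta_params, problem_batch):
    # Placeholder: in the paper this would unroll the learned optimizer for
    # 50000 inner steps on each sampled atomic-structure problem.
    return jnp.sum(meta_params ** 2) + 0.0 * jnp.sum(problem_batch)

# Outer optimizer as reported: Adam at 1e-2, decayed by 0.98 every 10 meta-steps.
schedule = optax.exponential_decay(init_value=1e-2, transition_steps=10, decay_rate=0.98)
meta_optimizer = optax.adam(learning_rate=schedule)

meta_params = jnp.zeros(256)                  # stands in for the learned-optimizer weights
opt_state = meta_optimizer.init(meta_params)

for step in range(1000):                      # 1000 meta-updates in total
    # Stand-in for 80 random initializations of atomic structure problems.
    problem_batch = jax.random.normal(jax.random.PRNGKey(step), (80, 3))
    grads = jax.grad(meta_loss)(meta_params, problem_batch)
    updates, opt_state = meta_optimizer.update(grads, opt_state, meta_params)
    meta_params = optax.apply_updates(meta_params, updates)
```

In the distributed setting from the previous sketch, `grads` would be the pmean-averaged meta-gradient gathered from the devices before the Adam update is applied.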