Learn2Hop: Learned Optimization on Rough Landscapes
Authors: Amil Merchant, Luke Metz, Samuel S. Schoenholz, Ekin D. Cubuk
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we show that learned optimizers can offer a substantial improvement over classical algorithms for these sorts of global optimization problems. Using several canonical problems in atomic structural optimization, we demonstrate that the learned optimizers outperform their classical counterparts when trained and tested on similar systems and, more surprisingly, are able to generalize to unseen systems. Table 2: Learned optimizers show improvement across all tested potential types. |
| Researcher Affiliation | Industry | Google Research, Mountain View, California, USA. This work was done as part of the Google AI Residency Program (https://research.google/careers/ai-residency/). Correspondence to: Amil Merchant <amilmerchant@google.com>, Ekin D. Cubuk <cubuk@google.com>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | Yes | Code is available at https://learn2hop.page.link/github. |
| Open Datasets | No | The paper refers to well-known empirical potentials (Lennard-Jones, Gupta, Stillinger-Weber) for atomic systems and mentions that JAX-MD was used to code them, but it does not provide concrete access information (specific links, DOIs, or repositories) to pre-collected datasets or data files. |
| Dataset Splits | No | The paper describes using random initializations for experiments but does not specify explicit train/validation/test dataset splits with percentages, sample counts, or predefined partition files. |
| Hardware Specification | Yes | The associated training and evaluation utilized V100 GPUs. For distributed training, the controller batches computation on up to 8 GPUs. |
| Software Dependencies | No | The paper mentions that 'The aforementioned potentials are coded using JAX-MD (Schoenholz & Cubuk, 2019). The learned optimizers are built in JAX (Bradbury et al., 2018)', but it does not provide specific version numbers for these software components. (A minimal JAX-MD usage sketch is given after the table.) |
| Experiment Setup | Yes | For each inner-loop, we apply 50000 optimization steps before computing meta-gradients. Batched training occurs with 80 random initializations of atomic structure problems. Once meta-gradients are averaged, a central controller meta-updates the parameters of the learned optimizer via Adam with a learning rate of 10⁻² (which decays exponentially by 0.98 every 10 steps). This repeats for a total of 1000 meta-updates. (A sketch of this meta-update schedule is given after the table.) |
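
The Software Dependencies and Open Datasets rows note that the empirical potentials (Lennard-Jones, Gupta, Stillinger-Weber) were coded with JAX-MD rather than distributed as datasets. The following is a minimal sketch of how a Lennard-Jones cluster energy can be defined with the public `jax_md` API; the atom count, box extent, and default `sigma`/`epsilon` values are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch (not the paper's code): a Lennard-Jones cluster energy
# defined with the public jax_md API. All numeric choices here are
# illustrative assumptions.
import jax
import jax.numpy as jnp
from jax_md import space, energy

key = jax.random.PRNGKey(0)
n_atoms = 13  # small cluster size, chosen arbitrarily for illustration
positions = jax.random.uniform(key, (n_atoms, 3), minval=0.0, maxval=3.0)

# Free (non-periodic) space is a natural choice for isolated clusters.
displacement_fn, shift_fn = space.free()

# Pairwise Lennard-Jones energy; sigma and epsilon are left at their defaults.
energy_fn = energy.lennard_jones_pair(displacement_fn)

total_energy = energy_fn(positions)       # scalar potential energy
forces = -jax.grad(energy_fn)(positions)  # (n_atoms, 3) forces for any optimizer
```

Because the potential is an ordinary differentiable JAX function, the same `energy_fn` can be handed to either a classical optimizer or a learned one, which is what makes this setup convenient for the paper's comparisons.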
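The Experiment Setup row describes the outer (meta) optimization: meta-gradients averaged over 80 random initializations after a 50000-step inner loop, applied by Adam at 10⁻² with an exponential decay of 0.98 every 10 steps, for 1000 meta-updates. The sketch below reproduces only that Adam schedule using `optax`; the meta-loss is a hypothetical placeholder for the expensive inner-loop rollout, and `optax` itself is an assumption, since the paper does not name its outer-loop optimizer library.

```python
# Minimal sketch of the described meta-update schedule using optax
# (optax is an assumption; the paper does not name its optimizer library).
# The meta-loss is a stand-in: the real one would unroll the learned
# optimizer for 50000 steps on 80 random atomic-structure initializations
# and average the resulting objective.
import jax
import jax.numpy as jnp
import optax

meta_params = jnp.zeros(128)  # hypothetical learned-optimizer parameters

def meta_loss(params, batch_keys):
    del batch_keys  # placeholder: keys would seed the 80 random inner problems
    return jnp.sum(params ** 2)

# Adam at 1e-2, decayed exponentially by 0.98 every 10 meta-steps.
schedule = optax.exponential_decay(
    init_value=1e-2, transition_steps=10, decay_rate=0.98, staircase=True)
optimizer = optax.adam(learning_rate=schedule)
opt_state = optimizer.init(meta_params)

key = jax.random.PRNGKey(0)
for step in range(1000):  # 1000 meta-updates in total
    key, sub = jax.random.split(key)
    batch_keys = jax.random.split(sub, 80)  # 80 random initializations per batch
    grads = jax.grad(meta_loss)(meta_params, batch_keys)
    updates, opt_state = optimizer.update(grads, opt_state, meta_params)
    meta_params = optax.apply_updates(meta_params, updates)
```

In the paper's setting the gradient computation is the expensive part (each meta-gradient requires unrolling the learned optimizer for 50000 steps per initialization), which is why the Hardware Specification row mentions batching this computation across up to 8 V100 GPUs under a central controller.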