Gradual Optimization Learning for Conformational Energy Minimization

Authors: Artem Tsypin, Leonid Anatolievich Ugadiarov, Kuzma Khrabrov, Alexander Telepov, Egor Rumiantsev, Alexey Skrynnik, Aleksandr Panov, Dmitry P. Vetrov, Elena Tutubalina, Artur Kadurin

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules using significantly less additional data.
Researcher Affiliation | Collaboration | Artem Tsypin (1), Leonid Ugadiarov (2,4), Kuzma Khrabrov (1), Alexander Telepov (1), Egor Rumiantsev (1), Alexey Skrynnik (1,2), Aleksandr Panov (1,2,4), Dmitry Vetrov (5), Elena Tutubalina (1,3,6), Artur Kadurin (1,7). Affiliations: (1) AIRI, Moscow; (2) FRC CSC RAS, Moscow; (3) Sber AI, Moscow; (4) MIPT, Dolgoprudny; (5) Constructor University, Bremen; (6) ISP RAS Research Center for Trusted Artificial Intelligence, Moscow; (7) Kuban State University, Krasnodar.
Pseudocode | Yes | Algorithm 1: GOLF (a Python sketch of this loop follows the table).
Require: training dataset D0, genuine oracle O_G, surrogate oracle O_S, optimizer Opt, optimization rate α, NNP f(·; θ), number of additional O_G interactions K, time limit T, update-to-data ratio U
 1: Initialize the NNP f(·; θ) with the weights of the baseline NNP model
 2: D ← Copy(D0); t ← 0
 3: Sample s ~ D and calculate its energy with O_S: E_prev ← E^MMFF(s)
 4: repeat
 5:   s' ← s + α · Opt(F(s; θ))            ▷ Get next conformation using the NNP
 6:   Calculate the new energy with O_S: E_cur ← E^MMFF(s')
 7:   if E_cur > E_prev or t ≥ T then      ▷ Incorrect forces predicted at s, or T reached
 8:     Calculate E^DFT(s'), F^DFT(s') = O_G(s')
 9:     D ← D ∪ {(s', E^DFT(s'), F^DFT(s'))}   ▷ Add new data to D
10:     Train f(·; θ) on D using Eq. 2, U times
11:     t ← 0
12:     Sample s ~ D and calculate its energy with O_S: E_prev ← E^MMFF(s)
13:   else
14:     s ← s'
15:     E_prev ← E_cur
16:     t ← t + 1
17:   end if
18: until |D| − |D0| ≥ K
Open Source Code | Yes | We publish the source code for GOLF along with the optimization-trajectory datasets, training, and evaluation scripts: https://github.com/AIRI-Institute/GOLF
Open Datasets | Yes | Throughout this work, we use several subsets of the nablaDFT dataset (Khrabrov et al., 2022). Another dataset used in our work is SPICE (Eastman et al., 2023).
Dataset Splits | No | The paper mentions training on subsets of nablaDFT (D_0) and SPICE (D_0^SPICE), and evaluation on distinct test sets (D_test, D_test^SPICE). However, it does not explicitly describe a separate validation split or how hyperparameter tuning was performed using a validation set.
Hardware Specification | Yes | All the experiments were carried out on a cluster with 2 Nvidia Tesla V100 GPUs and 960 Intel(R) Xeon(R) Gold 2.60GHz CPU cores; the total computational cost is 80 CPU-years and 1900 GPU-hours.
Software Dependencies | Yes | Our implementation of GOLF is based on SchNetPack 2.0 (Schütt et al., 2023). Namely, we use SchNetPack 2.0's implementation of PaiNN and its data processing pipeline. (A hedged instantiation sketch follows the table.)
Experiment Setup | Yes | Table 5: Hyperparameter values for GOLF-10k (sketches of the L-BFGS step and the PaiNN setup follow the table).
NNP hyperparameters: backbone PaiNN; number of interaction layers 3; cutoff radius 5.0 Å; number of radial basis functions 50; hidden size (n_atom_basis) 128.
Training hyperparameters: number of parallel O_G 120; batch size 64; optimizer Adam; learning rate scheduler Cosine Annealing; initial learning rate 1 × 10^−4; final learning rate 1 × 10^−7; gradient clipping value 1.0; weight coefficient ρ 1 × 10^−2; total number of training steps 5 × 10^5; number of additional O_G interactions K 10000; update-to-data ratio U 50; time limit T_train 100.
Conformation optimizer hyperparameters: conformation optimizer L-BFGS; optimization rate α 1.0; max number of iterations in the inner cycle 5.
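
Below is a minimal Python sketch of the Algorithm 1 loop quoted above, included only to make the control flow concrete. It is not the authors' implementation: Sample, mmff_energy, dft_oracle, opt_step, predict_forces, and fit_on_dataset are all hypothetical stand-ins for, respectively, a dataset entry, the surrogate oracle O_S, the genuine oracle O_G, one conformation-optimizer step, the NNP force head, and one Eq. 2 training pass.

import copy
import random
from dataclasses import dataclass
from typing import Any

@dataclass
class Sample:
    conformation: Any        # atomic positions, e.g. an (N, 3) array
    energy: float = 0.0      # DFT energy (for oracle-labelled samples)
    forces: Any = None       # DFT forces (for oracle-labelled samples)

def golf_train(nnp, dataset_0, mmff_energy, dft_oracle, opt_step,
               alpha=1.0, K=10_000, T=100, U=50):
    """Sketch of GOLF (Algorithm 1); all callables are hypothetical stand-ins.

    mmff_energy(s) -> float      surrogate oracle O_S
    dft_oracle(s)  -> (E, F)     genuine oracle O_G
    opt_step(F)    -> step       one conformation-optimizer step (e.g. L-BFGS)
    """
    dataset = copy.deepcopy(dataset_0)            # D <- Copy(D0)
    t = 0
    s = random.choice(dataset).conformation       # sample s ~ D
    e_prev = mmff_energy(s)

    while len(dataset) - len(dataset_0) < K:      # until |D| - |D0| >= K
        s_next = s + alpha * opt_step(nnp.predict_forces(s))
        e_cur = mmff_energy(s_next)
        if e_cur > e_prev or t >= T:              # bad forces at s, or T reached
            energy, forces = dft_oracle(s_next)              # expensive DFT call
            dataset.append(Sample(s_next, energy, forces))   # grow D
            for _ in range(U):                               # update-to-data ratio U
                nnp.fit_on_dataset(dataset)                  # one Eq. 2 training pass
            t = 0
            s = random.choice(dataset).conformation          # restart from D
            e_prev = mmff_energy(s)
        else:                                     # surrogate energy decreased: accept
            s, e_prev, t = s_next, e_cur, t + 1
    return nnp

Note the asymmetry that keeps GOLF cheap: the per-step check calls only the MMFF surrogate, while the DFT oracle is queried only when the surrogate energy fails to decrease or the time limit T is hit. Under a common convention for joint energy-and-force training (the paper's exact Eq. 2 may weight the terms differently), fit_on_dataset would minimize something like rho * MSE(energies) + MSE(forces) with rho = 1e-2 from Table 5.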
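The conformation optimizer in Table 5 is L-BFGS with optimization rate α = 1.0 and at most 5 inner iterations. The following sketch shows one such outer step using PyTorch's torch.optim.LBFGS, assuming a hypothetical nnp_energy callable that returns a differentiable scalar energy; the paper's actual optimizer implementation may differ.

import torch

def relax_step(positions, nnp_energy):
    """One outer step: run L-BFGS on the NNP energy surface.
    nnp_energy(pos) -> scalar tensor is a hypothetical stand-in
    for the trained NNP's energy prediction."""
    pos = positions.clone().detach().requires_grad_(True)
    # Table 5: optimization rate alpha = 1.0, at most 5 inner iterations.
    optimizer = torch.optim.LBFGS([pos], lr=1.0, max_iter=5)

    def closure():
        optimizer.zero_grad()
        energy = nnp_energy(pos)
        energy.backward()     # forces are the negative of this gradient
        return energy

    optimizer.step(closure)
    return pos.detach()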
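Finally, the NNP hyperparameters in Table 5 map directly onto SchNetPack 2.0's PaiNN constructor. This is a sketch assuming SchNetPack 2.0's documented representation API (spk.representation.PaiNN, spk.nn.GaussianRBF, spk.nn.CosineCutoff); the exact wiring in the GOLF repository may differ.

import schnetpack as spk

cutoff = 5.0  # Å, cutoff radius from Table 5

# PaiNN backbone with the GOLF-10k settings:
# 3 interaction layers, 50 radial basis functions, hidden size 128.
painn = spk.representation.PaiNN(
    n_atom_basis=128,                                   # hidden size
    n_interactions=3,                                   # interaction layers
    radial_basis=spk.nn.GaussianRBF(n_rbf=50, cutoff=cutoff),
    cutoff_fn=spk.nn.CosineCutoff(cutoff),
)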