Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
Authors: Songtao Liu, Hanjun Dai, Yue Zhao, Peng Liu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our framework can consistently boost performance across various strategies and outperforms previous state-of-the-art top-1 accuracy by a margin of 2.5%. |
| Researcher Affiliation | Collaboration | 1The Pennsylvania State University 2Google DeepMind 3University of Southern California. Correspondence to: Songtao Liu <skl5761@psu.edu>. |
| Pseudocode | Yes | Algorithm 1 CREBM Framework |
| Open Source Code | Yes | Code is available at https://github.com/SongtaoLiu0823/CREBM. |
| Open Datasets | Yes | Dataset. We use the public dataset RetroBench (Liu et al., 2023b) for evaluation. The target molecules associated with synthetic routes are split into training, validation, and test datasets in an 80%/10%/10% ratio. |
| Dataset Splits | Yes | The target molecules associated with synthetic routes are split into training, validation, and test datasets in an 80%/10%/10% ratio. We have 46,458 data points for training, 5,803 for validation, and 5,838 for testing. |
| Hardware Specification | Yes | All the experiments of baselines are conducted on a single NVIDIA Tesla A100 with 80GB of memory. |
| Software Dependencies | Yes | The software that we use for experiments is Python 3.6.8, CUDA 10.2.89, CUDNN 7.6.5, einops 0.4.1, pytorch 1.9.0, pytorch-scatter 2.0.9, pytorch-sparse 0.6.12, numpy 1.19.2, torchvision 0.10.0, and torchdrug 0.1.3. |
| Experiment Setup | Yes | We employ a standard Transformer (Vaswani et al., 2017) architecture to implement Eθ(T \| m_tar, c), with the target molecule serving as the input to the encoder and the right-shifted starting material as the input to the decoder. The output is the logits of the left-shifted starting material, which are used to compute Eθ. One thing we'd like to point out is that Eθ is first pretrained on the target-to-starting-material task, so we naturally deploy it for this modeling instead of training an encoder-only model from scratch. We also employ the standard Transformer architecture to implement the forward model, framing the prediction of a product from starting materials as a sequence-to-sequence task. To construct our preference dataset D, we sample 10 synthetic routes for each molecule in the training dataset. All the models in this work are trained on an NVIDIA Tesla A100 GPU. The tables in 'D.2. Hyperparameter Details' provide concrete values for max length, embedding size, encoder layers, decoder layers, attention heads, FFN hidden size, dropout, epochs, batch size, warmup, and learning rate. |
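
The Experiment Setup row describes an encoder-decoder Transformer whose decoder logits over the starting-material tokens are used to compute the energy Eθ(T | m_tar, c). The sketch below is a minimal, hypothetical PyTorch rendering of that interface, not the authors' released code (see the linked repository for the actual implementation): the class name, hyperparameter defaults, the omission of the conditioning term c, and the pooling of token logits into a scalar energy via a summed negative log-likelihood are all illustrative assumptions.

```python
# Hypothetical sketch of the energy model described above: a standard
# Transformer encoder-decoder; the target molecule feeds the encoder, the
# right-shifted starting-material tokens feed the decoder, and the decoder
# logits over the left-shifted tokens are pooled into a scalar energy.
import torch
import torch.nn as nn
import torch.nn.functional as F


class EnergyTransformer(nn.Module):
    def __init__(self, vocab_size=100, d_model=256, nhead=8,
                 num_layers=4, dim_ff=1024, pad_id=0):
        super().__init__()
        self.pad_id = pad_id
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=pad_id)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            dim_feedforward=dim_ff, batch_first=True)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, target_mol, sm_shift_right, sm_shift_left):
        # target_mol:     (B, S) token ids of the target molecule (encoder input)
        # sm_shift_right: (B, L) right-shifted starting-material ids (decoder input)
        # sm_shift_left:  (B, L) left-shifted starting-material ids (prediction target)
        causal_mask = self.transformer.generate_square_subsequent_mask(
            sm_shift_right.size(1)).to(target_mol.device)
        hidden = self.transformer(
            src=self.embed(target_mol),
            tgt=self.embed(sm_shift_right),
            tgt_mask=causal_mask,
            src_key_padding_mask=target_mol.eq(self.pad_id),
            tgt_key_padding_mask=sm_shift_right.eq(self.pad_id))
        logits = self.proj(hidden)                       # (B, L, vocab)
        # Assumed pooling: energy = negative log-likelihood of the
        # starting-material tokens, summed over non-padding positions.
        nll = F.cross_entropy(
            logits.transpose(1, 2), sm_shift_left,
            ignore_index=self.pad_id, reduction="none")  # (B, L)
        return nll.sum(dim=1)                            # one energy per route


# Toy usage with random token ids, just to exercise the interface:
model = EnergyTransformer()
tgt_mol = torch.randint(1, 100, (2, 12))
sm_right = torch.randint(1, 100, (2, 10))
sm_left = torch.randint(1, 100, (2, 10))
energy = model(tgt_mol, sm_right, sm_left)               # shape (2,)
```

Under this reading, pretraining on the target-to-starting-material task amounts to training the same network with a standard sequence-to-sequence cross-entropy objective before reusing its logits for energy scoring; how CREBM actually combines or calibrates the energies for preference optimization is detailed in the paper and repository rather than in this excerpt.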