Path-Aware and Structure-Preserving Generation of Synthetically Accessible Molecules

Authors: Juhwan Noh, Dae-Woong Jeong, Kiyoung Kim, Sehui Han, Moontae Lee, Honglak Lee, Yousung Jung

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4. Experiments"
Researcher Affiliation | Collaboration | (1) Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology, Republic of Korea; (2) LG AI Research, Republic of Korea; (3) Department of Information and Decision Sciences, University of Illinois Chicago, USA; (4) Graduate School of Artificial Intelligence, Korea Advanced Institute of Science and Technology, Republic of Korea
Pseudocode | No | The paper describes the model architecture and sampling steps in text and diagrams (Figure 2, Figure 3), but it does not include structured pseudocode or an explicitly labeled algorithm block.
Open Source Code | No | The paper provides no concrete access to source code for the methodology described: there is no repository link and no explicit code-release statement.
Open Datasets | Yes | "To construct the reaction database used for training, we used a set of commercially available 150,000 molecules obtained from Gottipati et al. (2020) with 58 reaction templates collected from Hartenfeller et al. (2011) relevant to drug discovery." (A sketch of applying such reaction templates with RDKit follows this table.)
Dataset Splits | Yes | "For optimization of model parameters, we randomly sampled 2 M reaction sequences for model training (95%) and validation (5%) among the total 3 M reaction sequences." (See the split sketch after this table.)
Hardware Specification | No | The paper does not report hardware details such as GPU/CPU models, processor types, or memory amounts used to run its experiments.
Software Dependencies | No | The paper mentions software such as PyTorch and RDKit but does not specify version numbers, which are needed for reproducibility.
Experiment Setup | Yes | "For the model training, we empirically set the hyperparameters. In detail, we used α = 0.3, and applied cyclic annealing (Fu et al., 2019) for the KL-divergence term by changing the value of β from 0 to 0.1. ... we set the number of cycles as 4 (i.e. repeating four times of annealing) during training epoch of 500. ... with a learning rate of 0.00005." (See the annealing sketch after this table.)
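To make the Open Datasets row concrete: the 58 templates from Hartenfeller et al. (2011) are reaction SMARTS patterns that can be applied to building-block molecules with RDKit. The sketch below is a hypothetical illustration; the amide-coupling SMARTS shown is a generic example, not taken from the paper's template set.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Generic amide-coupling template (illustrative only, not one of the
# paper's 58 templates): carboxylic acid + primary amine -> amide.
rxn = AllChem.ReactionFromSmarts(
    "[C:1](=[O:2])[OH].[N;H2:3]>>[C:1](=[O:2])[N:3]"
)

acid = Chem.MolFromSmiles("CC(=O)O")        # acetic acid (building block)
amine = Chem.MolFromSmiles("NCCc1ccccc1")   # phenethylamine (building block)

# RunReactants returns one product set per matching atom mapping.
products = rxn.RunReactants((acid, amine))
print(Chem.MolToSmiles(products[0][0]))     # CC(=O)NCCc1ccccc1
```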
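The 95%/5% split quoted in the Dataset Splits row can be reproduced in spirit with a few lines of Python. This is a minimal sketch under stated assumptions: the paper does not publish its sampling code or random seed, so the `seed` value and the function name are hypothetical.

```python
import random

def split_reaction_sequences(sequences, n_sample=2_000_000,
                             val_frac=0.05, seed=0):
    """Sample n_sample sequences from the full pool (3 M in the paper),
    then hold out val_frac of the sample for validation."""
    rng = random.Random(seed)           # seed is an assumption; not reported
    sampled = rng.sample(sequences, n_sample)
    n_val = int(len(sampled) * val_frac)
    return sampled[n_val:], sampled[:n_val]   # (train: 95%, val: 5%)
```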
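Finally, the cyclic KL annealing in the Experiment Setup row (Fu et al., 2019) can be expressed as a per-epoch schedule for β. The sketch below uses the quoted values (β rising from 0 to 0.1, 4 cycles, 500 epochs); the linear-ramp shape and the `ramp_frac` parameter are assumptions, since the paper does not state how β rises within a cycle.

```python
def cyclic_beta(epoch, total_epochs=500, n_cycles=4,
                beta_max=0.1, ramp_frac=0.5):
    """Cyclic KL-annealing schedule in the style of Fu et al. (2019):
    within each cycle, beta ramps linearly from 0 to beta_max over the
    first ramp_frac of the cycle, then stays at beta_max."""
    cycle_len = total_epochs / n_cycles       # 125 epochs per cycle here
    pos = (epoch % cycle_len) / cycle_len     # position within cycle, [0, 1)
    return beta_max * min(pos / ramp_frac, 1.0)
```

In a PyTorch training loop this β would scale the KL term of the loss each epoch, with the quoted learning rate of 0.00005 passed to the optimizer, e.g. `torch.optim.Adam(model.parameters(), lr=5e-5)`; the choice of Adam is an assumption, since the paper quotes only the learning rate.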