Learning Multimodal Graph-to-Graph Translation for Molecule Optimization

Authors: Wengong Jin, Kevin Yang, Regina Barzilay, Tommi Jaakkola

ICLR 2019

Reproducibility assessment. Each entry below gives the Reproducibility Variable, its Result, and the supporting LLM Response:
Research Type: Experimental
LLM Response: "We evaluate our model on multiple molecular optimization tasks and show that our model outperforms previous state-of-the-art baselines."
Researcher Affiliation: Academia
LLM Response: "Wengong Jin, Kevin Yang, Regina Barzilay, Tommi Jaakkola. Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology. {wengong, regina, tommi}@csail.mit.edu; yangk@mit.edu"
Pseudocode: Yes
LLM Response: "Algorithm 1 Adversarial Scaffold Regularization"
Open Source Code: Yes
LLM Response: "Code and data are available at https://github.com/wengong-jin/iclr19-graph2graph"
Open Datasets: Yes
LLM Response: "we extracted 99K and 79K translation pairs respectively from the ZINC dataset (Sterling & Irwin, 2015; Jin et al., 2018) for training. We extracted a training set of 88K molecule pairs with similarity constraint δ = 0.4. With similarity constraint δ = 0.4, we derived a training set of 34K molecular pairs from ZINC and the dataset collected by Olivecrona et al. (2017)." (A sketch of this similarity-constrained pair extraction follows the table.)
Dataset Splits: No
LLM Response: "On the penalized log P task... We use their validation and test sets for evaluation. For each task, we ensured that all molecules in validation and test set had never appeared during training." The paper states that the validation and test sets come from prior work and are kept disjoint from training, but it gives no split percentages, absolute counts, or a detailed splitting methodology for all datasets; for the QED and DRD2 tasks, only train and test set sizes are reported.
Hardware Specification: No
LLM Response: The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies: No
LLM Response: The paper mentions the Adam optimizer, RDKit (cited but without a version number), and, implicitly, Python, yet it does not pin version numbers for any key software dependency (e.g., 'PyTorch 1.9' or 'RDKit 2020.09.1').
Experiment Setup: Yes
LLM Response: "For our models, the hidden state dimension is 300 and latent code dimension |z| = 8, and we set the KL regularization weight λKL = 1/|z|. For the VSeq2Seq model, the encoder is a one-layer bidirectional LSTM and the decoder is a one-layer LSTM with hidden state dimension 600. All models are trained with the Adam optimizer for 20 epochs with learning rate 0.001. We anneal the learning rate by 0.9 for every epoch. For adversarial training, our discriminator is a three-layer feed-forward network with hidden layer dimension 300 and LeakyReLU activation function. The discriminator is trained for N = 5 iterations with gradient penalty weight β = 10." (A code sketch of these hyperparameters follows below.)
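The Open Datasets row reports translation pairs filtered by a similarity constraint δ = 0.4. The snippet below is a minimal sketch of how such pairs could be extracted with RDKit, assuming Tanimoto similarity over Morgan fingerprints (radius 2, 2048 bits); the function names, fingerprint settings, and `extract_pairs` helper are illustrative assumptions, not taken from the paper's released code.

```python
# Hypothetical sketch of similarity-constrained pair extraction with RDKit.
# Assumes Tanimoto similarity over Morgan fingerprints; settings are illustrative.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def tanimoto(smiles_x, smiles_y, radius=2, n_bits=2048):
    """Tanimoto similarity between Morgan fingerprints of two SMILES strings."""
    mol_x = Chem.MolFromSmiles(smiles_x)
    mol_y = Chem.MolFromSmiles(smiles_y)
    if mol_x is None or mol_y is None:
        return 0.0  # unparsable SMILES contribute no pair
    fp_x = AllChem.GetMorganFingerprintAsBitVect(mol_x, radius, nBits=n_bits)
    fp_y = AllChem.GetMorganFingerprintAsBitVect(mol_y, radius, nBits=n_bits)
    return DataStructs.TanimotoSimilarity(fp_x, fp_y)

def extract_pairs(candidates, delta=0.4):
    """Keep candidate (X, Y) translation pairs whose similarity meets delta."""
    return [(x, y) for x, y in candidates if tanimoto(x, y) >= delta]
```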
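The Experiment Setup row pins down most training hyperparameters. The PyTorch sketch below shows one plausible instantiation of them: a three-layer feed-forward discriminator with LeakyReLU activations, Adam at learning rate 0.001 annealed by 0.9 per epoch, λKL = 1/|z| with |z| = 8, and a gradient penalty with weight β = 10 in the WGAN style that the paper's adversarial training builds on. The module layout and names are assumptions; only the numeric constants come from the quoted setup.

```python
# Hypothetical PyTorch sketch of the reported hyperparameters; only the
# numeric constants come from the paper, the structure is an assumption.
import torch
import torch.nn as nn

HIDDEN_DIM = 300               # hidden state dimension
LATENT_DIM = 8                 # latent code dimension |z|
KL_WEIGHT = 1.0 / LATENT_DIM   # lambda_KL = 1 / |z|
GP_WEIGHT = 10.0               # gradient penalty weight beta
DISC_ITERS = 5                 # discriminator iterations N
EPOCHS = 20

# One reading of "three-layer feed-forward network with hidden layer
# dimension 300 and LeakyReLU activation function".
discriminator = nn.Sequential(
    nn.Linear(HIDDEN_DIM, HIDDEN_DIM),
    nn.LeakyReLU(),
    nn.Linear(HIDDEN_DIM, HIDDEN_DIM),
    nn.LeakyReLU(),
    nn.Linear(HIDDEN_DIM, 1),
)

# Adam with learning rate 0.001, annealed by 0.9 after every epoch.
optimizer = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)

def gradient_penalty(disc, real, fake):
    """WGAN-style gradient penalty on interpolates of real and fake inputs."""
    eps = torch.rand(real.size(0), 1, device=real.device)
    interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    grads, = torch.autograd.grad(disc(interp).sum(), interp, create_graph=True)
    return GP_WEIGHT * ((grads.norm(2, dim=1) - 1.0) ** 2).mean()
```

In a full training loop one would call `scheduler.step()` once per epoch and add `gradient_penalty(...)` to the discriminator loss for each of the N = 5 discriminator iterations; the complete procedure is given by the paper's Algorithm 1 (Adversarial Scaffold Regularization), which this report only names.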