Transformer-based Objective-reinforced Generative Adversarial Network to Generate Desired Molecules
Authors: Chen Li, Chikashige Yamanaka, Kazuma Kaitoh, Yoshihiro Yamanishi
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments were performed using the ZINC chemical dataset, and the results demonstrated the usefulness of TransORGAN in terms of uniqueness, novelty, and diversity of the generated molecules. |
| Researcher Affiliation | Academia | Chen Li, Chikashige Yamanaka, Kazuma Kaitoh, and Yoshihiro Yamanishi; Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka, Japan. li260@bio.kyutech.ac.jp, yamanaka.chikashige215@mail.kyutech.jp, {kaito168, yamani}@bio.kyutech.ac.jp |
| Pseudocode | Yes | Algorithm 1: MC search under policy Gθ; Algorithm 2: Pre-training and adversarial training for TransORGAN (a minimal MC-search sketch follows the table). |
| Open Source Code | No | The paper does not provide an explicit statement or a direct link to the open-source code for the methodology described. |
| Open Datasets | Yes | The test data were a subset of the ZINC chemical dataset [Ramakrishnan et al., 2014], which contains 134,000 molecules represented by SMILES strings. |
| Dataset Splits | No | The paper states it used a subset of the ZINC dataset and discusses pre-training and adversarial training, but does not provide specific train/validation/test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper mentions using PyTorch but does not provide specific hardware details such as the GPU or CPU models used for the experiments. |
| Software Dependencies | Yes | All experiments were performed using PyTorch version 1.8.1. |
| Experiment Setup | Yes | We set the dimension of the word embedding to 16 and the dropout rate to 0.2; the encoder and decoder each had four heads and two stacked layers, and the generator was pre-trained for 100 epochs by maximum likelihood estimation (MLE). For the discriminator, the word-embedding dimension was 32; the numbers of kernels were 1, 3, 5, 7, and 9; the kernel sizes were 20, 30, 40, 50, and 60; the dropout rate was 0.75; and pre-training ran for ten epochs. The tradeoff between maintaining the likelihood and RL was set to λ = 0.5, and the MC search time N to 16. (Configuration and reward sketches follow the table.) |
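
As a reading aid for Algorithm 1: the MC search estimates an intermediate reward for a partially generated SMILES string by rolling it out to completion under the generator policy Gθ and averaging the discriminator scores of the finished sequences. A minimal sketch, assuming hypothetical `sample_continuation` and `score` methods (these names are not from the paper):

```python
def mc_search_reward(generator, discriminator, partial_seq, max_len, n_rollouts=16):
    """Monte Carlo search under policy G_theta: complete the partial token
    sequence n_rollouts times and average the discriminator scores.
    n_rollouts=16 matches the MC search time N reported in the paper."""
    total = 0.0
    for _ in range(n_rollouts):
        # Hypothetical API: autoregressively sample the remaining tokens.
        full_seq = generator.sample_continuation(partial_seq, max_len)
        # Hypothetical API: discriminator probability that full_seq is real.
        total += discriminator.score(full_seq)
    return total / n_rollouts
```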
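
The hyperparameters in the Experiment Setup row map naturally onto a PyTorch 1.8-style configuration. The sketch below is an assumption about module layout (the paper releases no code); `VOCAB_SIZE` is hypothetical, and the pairing of kernel counts with kernel sizes follows the paper's wording:

```python
import torch.nn as nn

VOCAB_SIZE = 64  # assumption: size of the SMILES token vocabulary

# Generator: word-embedding dimension 16, four attention heads,
# two stacked encoder and two stacked decoder layers, dropout 0.2.
generator = nn.Transformer(
    d_model=16,
    nhead=4,
    num_encoder_layers=2,
    num_decoder_layers=2,
    dropout=0.2,
)

# Discriminator: word-embedding dimension 32, parallel 1-D convolutions
# with 1/3/5/7/9 kernels of sizes 20/30/40/50/60, dropout 0.75.
disc_embedding = nn.Embedding(VOCAB_SIZE, 32)
disc_convs = nn.ModuleList(
    nn.Conv1d(in_channels=32, out_channels=n, kernel_size=k)
    for n, k in zip((1, 3, 5, 7, 9), (20, 30, 40, 50, 60))
)
disc_dropout = nn.Dropout(0.75)
```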
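
Finally, the λ = 0.5 tradeoff "between maintaining the likelihood and RL" suggests a convex combination of the MLE and policy-gradient terms in the generator objective. The combination form below is an assumption, not the paper's equation:

```python
def generator_loss(mle_loss: float, rl_loss: float, lam: float = 0.5) -> float:
    """Assumed combined objective: lam trades off maintaining the MLE
    likelihood against the RL (policy-gradient) signal; lam = 0.5 is the
    value reported in the paper."""
    return lam * mle_loss + (1.0 - lam) * rl_loss
```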