DiSCO: Diffusion Schrödinger Bridge for Molecular Conformer Optimization
Authors: Danyeong Lee, Dohoon Lee, Dongmin Bang, Sun Kim
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through comprehensive evaluations and analyses, we establish the strengths of our framework, substantiating the application of the Schrödinger bridge for molecular conformer optimization. First, our approach consistently outperforms four baseline approaches, producing conformers with higher diversity and improved quality. Then, we show that the intermediate conformers generated during our diffusion process exhibit valid and chemically meaningful characteristics. We also demonstrate the robustness of our method when starting from conformers of diverse quality, including those unseen during training. Lastly, we show that the precise generation of low-energy conformers via our framework helps in enhancing the downstream prediction of molecular properties. |
| Researcher Affiliation | Collaboration | Danyeong Lee (1), Dohoon Lee (2, 3), Dongmin Bang (1, 4), Sun Kim (1, 4, 5, 6); (1) Interdisciplinary Program in Bioinformatics, Seoul National University; (2) Bioinformatics Institute, Seoul National University; (3) BK21 FOUR Intelligence Computing, Seoul National University; (4) AIGENDRUG Co., Ltd.; (5) Department of Computer Science and Engineering, Seoul National University; (6) Interdisciplinary Program in Artificial Intelligence, Seoul National University |
| Pseudocode | Yes | Please refer to Alg. 1 and Alg. 2 for detailed training and generation procedures. |
| Open Source Code | Yes | The code is available at https://github.com/Danyeong-Lee/DiSCO. |
| Open Datasets | Yes | We used GEOM-QM9 and GEOM-Drugs, widely used benchmark datasets in molecular conformer generation. We follow the same dataset split as Xu et al. (2021) and Shi et al. (2021). |
| Dataset Splits | Yes | Each dataset is made up of 40,000 molecules for training, 5,000 molecules for validation, and 200 molecules for testing. (A split-size sketch follows this table.) |
| Hardware Specification | No | The paper states: "The ICT at Seoul National University provided research facilities for this study." This is too general and does not specify any particular hardware components like GPU models or CPU types. |
| Software Dependencies | No | The paper mentions: "Our network implementation utilizes the e3nn library (Geiger and Smidt 2022)" and "The xTB package (Bannwarth, Ehlert, and Grimme 2019) is employed". While these name the software, they do not give version numbers for the libraries/packages, which are required for reproducibility. |
| Experiment Setup | No | The paper mentions: "The search space includes the number of diffusion steps (options: 5, 7, 10, or 15), noise scheduling (quadratic or sigmoid) and the number of training epochs (100 or 250). A complete description of the hyperparameter settings and the specifics of noise scheduling can be found in the Appendix." While the search space is outlined, the selected values and a complete description are deferred to the Appendix rather than provided in the main text. (A sketch of this search space follows the table.) |
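The split sizes quoted in the Dataset Splits row can be written down as a short script. This is a minimal sketch only: `molecules` and `split_geom` are hypothetical names, the list is assumed to already be in the canonical order, and the paper itself reuses the split of Xu et al. (2021) and Shi et al. (2021) rather than a simple slice.

```python
# Minimal sketch of the 40,000 / 5,000 / 200 split sizes reported for
# GEOM-QM9 and GEOM-Drugs. `molecules` is a hypothetical pre-ordered list;
# the paper reuses the split of Xu et al. (2021) and Shi et al. (2021).
from typing import List, Sequence, Tuple

N_TRAIN, N_VAL, N_TEST = 40_000, 5_000, 200

def split_geom(molecules: Sequence[dict]) -> Tuple[List[dict], List[dict], List[dict]]:
    """Slice an already-ordered dataset into train/val/test of the reported sizes."""
    assert len(molecules) >= N_TRAIN + N_VAL + N_TEST, "dataset too small for the reported split"
    train = list(molecules[:N_TRAIN])
    val = list(molecules[N_TRAIN:N_TRAIN + N_VAL])
    test = list(molecules[N_TRAIN + N_VAL:N_TRAIN + N_VAL + N_TEST])
    return train, val, test
```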
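The Experiment Setup row quotes the hyperparameter search space but not the values ultimately selected. The sketch below only enumerates that quoted grid; the `iter_configs` helper and the idea of sweeping it exhaustively are assumptions for illustration, not the authors' tuning procedure.

```python
# Minimal sketch of the hyperparameter search space quoted from the paper.
# Only the candidate values come from the paper; the enumeration helper is assumed.
from itertools import product

SEARCH_SPACE = {
    "diffusion_steps": [5, 7, 10, 15],          # number of diffusion steps
    "noise_schedule": ["quadratic", "sigmoid"],  # noise scheduling options
    "epochs": [100, 250],                        # training epoch budgets
}

def iter_configs(search_space):
    """Yield every combination of the quoted search space as a config dict."""
    keys = list(search_space)
    for values in product(*(search_space[key] for key in keys)):
        yield dict(zip(keys, values))

if __name__ == "__main__":
    for config in iter_configs(SEARCH_SPACE):
        print(config)  # 4 x 2 x 2 = 16 candidate configurations
```

The quoted grid yields only sixteen combinations (4 step counts x 2 schedules x 2 epoch budgets), small enough for an exhaustive sweep, though the paper defers the chosen settings to its Appendix.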