Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

CIDD: Collaborative Intelligence for Structure-Based Drug Design Empowered by LLMs

Authors: Bowen Gao, Yanwen Huang, Yiqiao Liu, Wenxuan Xie, Bowei He, Haichuan Tan, Wei-Ying Ma, Ya-Qin Zhang, Yanyan Lan

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On the CrossDocked2020 benchmark, CIDD consistently improves drug-likeness metrics, including QED, SA, and MRR, across different base generative models, while maintaining competitive binding affinity. Notably, it raises the combined success rate (balancing drug-likeness and binding) from 15.72% to 34.59%, more than doubling previous results. These findings demonstrate the value of integrating knowledge reasoning with geometric generation to advance AI-driven drug design. 39th Conference on Neural Information Processing Systems (NeurIPS 2025). 4 Experiments
Researcher Affiliation | Academia | Bowen Gao1,2, Yanwen Huang3, Yiqiao Liu3, Wenxuan Xie4, Bowei He5, Haichuan Tan1, Wei-Ying Ma1, Ya-Qin Zhang1, Yanyan Lan1,6,7. 1 Institute for AI Industry Research (AIR), Tsinghua University, Beijing, China; 2 Department of Computer Science and Technology, Tsinghua University, Beijing, China; 3 Department of Pharmaceutical Science, Peking University, Beijing, China; 4 School of Future Technology, South China University of Technology, Guangzhou, China; 5 Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China; 6 Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China; 7 Beijing Academy of Artificial Intelligence (BAAI), Beijing, China
Pseudocode | Yes | Appendix D: Algorithm for MRR and AUR. The complete calculation process for assessing the reasonability of a molecule is outlined in Algorithm 1. Algorithm 1: Evaluation of Molecular Reasonability. Input: Molecule object (mol). Output: Molecular Reasonability (MRR) and Atom Unreasonable Ratio (AUR). Step 1: Detect Carbonyl and Imine Group Carbons. foreach bond in mol do: if bond is double and one atom is carbon, the other is oxygen or nitrogen, then record the carbon atom in carbonyl/imine groups.
Open Source Code | Yes | Code is available at https://github.com/bowen-gao/CIDD/.
Open Datasets | Yes | Evaluated on the CrossDocked2020 dataset [9], CIDD significantly outperforms state-of-the-art baselines, boosting the overall success ratio from 15.72% to 34.59%.
Dataset Splits | Yes | We follow prior 3D-SBDD settings and use the CrossDocked2020 dataset [9], adopting the same train/test split as TargetDiff [12], resulting in 100 protein pockets for test.
Hardware Specification | Yes | The 3D model sampling is performed using a single NVIDIA A100 GPU.
Software Dependencies | Yes | To further evaluate the physicochemical and pharmacokinetic properties of the generated molecules, we employ QikProp [30], a tool recognized for its robust performance in predicting molecular drug-likeness properties [16]. The assessed properties include aqueous solubility, lipophilicity, polar surface area (PSA), the number of metabolizable sites, and oral absorption. Detailed requirements for each property are provided in Appendix E. [...] [30] Schrödinger, LLC. QikProp. Schrödinger, LLC, New York, NY, 2025. Schrödinger Release 2021-2.
Experiment Setup | Yes | CIDD Settings. We use MolCRAFT in the SBIG step. All modules in the LEDD step are powered by GPT-4o. The Design Module generates 5 candidates per round, and the Selection Module selects the final molecule. For each protein pocket, we generate 10 molecules. All SBIG models are trained on CrossDocked2020 with their released weights.
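
The Pseudocode entry above quotes only Step 1 of Algorithm 1: scan every bond and record each carbon that is double-bonded to an oxygen or a nitrogen. The following is a minimal Python sketch of that single step, not the authors' implementation. It assumes a hypothetical molecule representation (a list of element symbols plus a list of `Bond` records) in place of the paper's molecule object, and the names `Bond` and `carbonyl_imine_carbons` are illustrative only. The later steps that produce MRR and AUR are not quoted here and are not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bond:
    """Hypothetical bond record; not taken from the paper or any library."""
    begin: int    # index of the first atom
    end: int      # index of the second atom
    order: float  # 1.0 = single, 2.0 = double, 3.0 = triple


def carbonyl_imine_carbons(symbols, bonds):
    """Return indices of carbons that are double-bonded to O or N (Step 1)."""
    carbons = set()
    for bond in bonds:
        if bond.order != 2.0:
            continue  # only double bonds can form C=O or C=N
        # Check both orientations, since either end may be the carbon.
        for carbon, other in ((bond.begin, bond.end), (bond.end, bond.begin)):
            if symbols[carbon] == "C" and symbols[other] in ("O", "N"):
                carbons.add(carbon)
    return carbons


# Acetone, heavy atoms only: CH3-C(=O)-CH3
acetone_symbols = ["C", "C", "O", "C"]
acetone_bonds = [Bond(0, 1, 1.0), Bond(1, 2, 2.0), Bond(1, 3, 1.0)]
print(carbonyl_imine_carbons(acetone_symbols, acetone_bonds))  # {1}
```

Only atom 1 is reported for acetone because it is the sole carbon with a double bond to oxygen. A C=C double bond, as in ethene, yields an empty set.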