FlexSBDD: Structure-Based Drug Design with Flexible Protein Modeling

Authors: Zaixi Zhang, Mengdi Wang, Qi Liu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that FlexSBDD achieves state-of-the-art performance in generating high-affinity molecules and effectively modeling the protein's conformational change to increase favorable protein-ligand interactions (e.g., hydrogen bonds) and decrease steric clashes.
Researcher Affiliation | Academia | Zaixi Zhang (1,2,3), Mengdi Wang (3), Qi Liu (1,2); 1: School of Computer Science and Technology, University of Science and Technology of China; 2: State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China; 3: Princeton University. zaixi@mail.ustc.edu.cn, mengdiw@princeton.edu, qiliuql@ustc.edu.cn
Pseudocode | Yes | We show the pseudocode of FlexSBDD training and generation in Algorithms 1 and 2.
Open Source Code | Yes | The code of the paper is provided at https://github.com/zaixizhang/FlexSBDD.
Open Datasets | Yes | Following previous works [65, 57], we use two popular benchmark datasets for experimental evaluations: CrossDocked and Binding MOAD. The Binding MOAD dataset [34] contains around 41k experimentally determined protein-ligand complexes. ... The CrossDocked dataset [25] contains 22.5 million protein-molecule pairs generated through cross-docking. ... The corresponding apo structures are obtained from Apobind [3] and the generated apo structures as described in Sec. 4.5.
Dataset Splits | Yes | 40k protein-ligand pairs for training, 100 pairs for validation, and 100 pairs for testing, following previous work [65].
Hardware Specification | Yes | It takes around 36 hours on one NVIDIA GeForce GTX A100 GPU to complete the training.
Software Dependencies | No | The paper mentions using the Adam optimizer and an Euler solver but does not provide specific version numbers for any software libraries or dependencies, such as PyTorch or TensorFlow.
Experiment Setup | Yes | To construct the protein-ligand KNN graph, we set k = 8 (each node is connected to its 8 nearest neighbors). In the default setting, we use hidden sizes of 256, 128, 128, and 64 for the scalar features of nodes, scalar features of edges, vector features of nodes, and vector features of edges, respectively. The Encoder and Decoder each have 6 layers, with the number of attention heads set to 4. The number of integration steps in flow matching is 20 for FlexSBDD. The loss-function hyperparameters w_atom, w_coord, w_ori, and w_sc are selected by grid search over {0.5, 1.0, 2.0, 3.0} and are set to 2.0, 1.0, 1.0, and 1.0 in the default setting. To train FlexSBDD, we use Adam [42] as our optimizer with a learning rate of 0.001, betas = (0.95, 0.999), and a batch size of 4 for 500k iterations.
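The reported setup can be summarized in a single configuration object together with the fixed-step Euler integration the paper mentions for flow-matching generation. The sketch below is illustrative only: the class and field names (`FlexSBDDConfig`, `euler_integrate`, etc.) are assumptions, not identifiers from the FlexSBDD codebase; only the numeric values are taken from the quoted text.

```python
from dataclasses import dataclass, field

@dataclass
class FlexSBDDConfig:
    """Hypothetical container mirroring the hyperparameters quoted above."""
    knn: int = 8                    # each node connects to its 8 nearest neighbors
    node_scalar_dim: int = 256      # scalar features of nodes
    edge_scalar_dim: int = 128      # scalar features of edges
    node_vector_dim: int = 128      # vector features of nodes
    edge_vector_dim: int = 64       # vector features of edges
    num_layers: int = 6             # encoder and decoder depth
    num_heads: int = 4              # attention heads
    flow_steps: int = 20            # Euler integration steps in flow matching
    loss_weights: dict = field(default_factory=lambda: dict(
        atom=2.0, coord=1.0, ori=1.0, sc=1.0))  # w_atom, w_coord, w_ori, w_sc
    lr: float = 1e-3
    betas: tuple = (0.95, 0.999)
    batch_size: int = 4
    iterations: int = 500_000

def euler_integrate(velocity_fn, x0, num_steps):
    """Fixed-step Euler solver over t in [0, 1]; velocity_fn stands in
    for the learned flow-matching vector field."""
    x, dt = list(x0), 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = [xi + dt * vi for xi, vi in zip(x, velocity_fn(x, t))]
    return x

cfg = FlexSBDDConfig()
# Toy sanity check: for dx/dt = x with x0 = 1, twenty Euler steps give (1.05)**20,
# approximating the exact solution e at t = 1.
approx = euler_integrate(lambda x, t: x, [1.0], cfg.flow_steps)
```

The weighted training loss would then combine four terms as `sum(cfg.loss_weights[k] * term[k] for k in cfg.loss_weights)`, matching the grid-searched weights above.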