Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
Authors: Zhenqiao Song, Yunlong Zhao, Wenxian Shi, Wengong Jin, Yang Yang, Lei Li
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our EnzyGen consistently achieves the best performance across all 323 testing families, surpassing the best baseline by 10.79% in terms of substrate binding affinity. |
| Researcher Affiliation | Academia | 1 Language Technologies Institute, Carnegie Mellon University; 2 Department of Chemistry, Massachusetts Institute of Technology; 3 Department of EECS, Massachusetts Institute of Technology; 4 Broad Institute of MIT and Harvard; 5 Department of Chemistry and Biochemistry, University of California Santa Barbara. |
| Pseudocode | No | The paper includes architectural diagrams (Figure 1) but does not provide any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code, model and dataset are released at https://github.com/LeiLiLab/EnzyGen. |
| Open Datasets | Yes | We further construct EnzyBench, a dataset with 3157 enzyme families, covering all available enzymes within the Protein Data Bank (PDB). The code, model and dataset are released at https://github.com/LeiLiLab/EnzyGen. |
| Dataset Splits | Yes | Then we select 30 third-level categories for validation and testing, respectively including 428 and 323 fourth-level categories. For each of the 30 third-level categories, we randomly split 100 PDB entries with 50 for validation and 50 for testing, while the remaining entries are utilized for training. |
| Hardware Specification | Yes | The model undergoes training for 1,000,000 steps using 8 NVIDIA RTX A6000 GPU cards. |
| Software Dependencies | No | The paper mentions initializing parameters from the 650M-parameter ESM-2 model (Lin et al., 2022b) but does not provide version numbers for software dependencies such as Python, PyTorch, or other libraries used in the implementation. |
| Experiment Setup | Yes | The hyperparameters λ/2 and K are set to 1.0 and 30, respectively. The model undergoes training for 1,000,000 steps... The batch size and learning rate are set to 8192 tokens and 3e-4, respectively. (A hedged sketch of this configuration appears below the table.) |
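
The Dataset Splits and Experiment Setup rows describe a concrete, reproducible procedure. The sketch below is a minimal Python reconstruction of that procedure, assuming the quoted description is complete; the function name, the seeding scheme, and the config key names are illustrative and do not come from the released EnzyGen code.

```python
import random

def split_third_level_category(pdb_entries, seed=0):
    """Split one third-level EC category as described in the paper:
    100 randomly chosen PDB entries are held out (50 for validation,
    50 for testing) and the remaining entries are used for training."""
    rng = random.Random(seed)
    entries = list(pdb_entries)
    rng.shuffle(entries)
    held_out = entries[:100]
    return {
        "valid": held_out[:50],
        "test": held_out[50:100],
        "train": entries[100:],
    }

# Training hyperparameters quoted in the Experiment Setup row; the key
# names are assumptions, while the values are taken from the paper.
TRAIN_CONFIG = {
    "max_steps": 1_000_000,     # training steps
    "batch_size_tokens": 8192,  # batch size measured in tokens
    "learning_rate": 3e-4,
    "K": 30,                    # the K hyperparameter reported as 30
    "lambda_weight": 1.0,       # the λ hyperparameter reported as 1.0
}
```

Applying `split_third_level_category` to each of the 30 selected third-level categories yields the per-category 50-validation/50-test splits reported above.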
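
Since the Software Dependencies row notes that only the ESM-2 initialization is specified, the snippet below shows one plausible way to load the 650M-parameter ESM-2 checkpoint via Meta's public fair-esm package; this is an assumption about tooling, not the paper's documented setup.

```python
# Assumes the public fair-esm package (pip install fair-esm). This is one
# way to obtain the 650M ESM-2 weights the paper initializes from, not
# necessarily how the EnzyGen implementation loads them.
import esm

model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
model.eval()  # the weights would then initialize the enzyme generator
```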