End-to-End Full-Atom Antibody Design

Authors: Xiangzhe Kong, Wenbing Huang, Yang Liu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on epitope-binding CDR-H3 design, complex structure prediction, and affinity optimization demonstrate the superiority of our end-to-end framework and full-atom modeling."
Researcher Affiliation | Academia | "1) Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; 2) Institute for AI Industry Research (AIR), Tsinghua University; 3) Gaoling School of Artificial Intelligence, Renmin University of China; 4) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China. Correspondence to: Wenbing Huang <hwenbing@126.com>, Yang Liu <liuyang2011@tsinghua.edu.cn>."
Pseudocode | No | The paper describes its methods through prose and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | "Codes for our dyMEAN are available at https://github.com/THUNLP-MT/dyMEAN."
Open Datasets | Yes | "We train all models on the Structural Antibody Database (SAbDab, Dunbar et al., 2014) retrieved in November 2022, and assess them with the RAbD benchmark (Adolf-Bryfogle et al., 2018) composed of 60 diverse complexes selected by domain experts. We split SAbDab into the training and validation sets with a ratio of 9 : 1 according to CDR-H3 clusters as suggested by Jin et al. (2021); Kong et al. (2022)."
Dataset Splits | Yes | "We split SAbDab into the training and validation sets with a ratio of 9 : 1 according to CDR-H3 clusters as suggested by Jin et al. (2021); Kong et al. (2022). The antibodies in the same clusters as the test set are dropped to maintain a convincing generalization test. We implement the clustering process with MMseqs2 (Steinegger & Söding, 2017) and the numbers of antibodies (clusters) in the training and the validation sets are 3,256 (1,644) and 365 (182)." (A cluster-level split sketch follows the table.)
Hardware Specification | Yes | "We train dyMEAN by Adam optimizer with the data-parallel framework of PyTorch on 2 GeForce RTX 2080 Ti GPUs." (See the training-setup sketch below.)
Software Dependencies | No | The paper mentions software such as PyTorch, the Adam optimizer, OpenMM, and FoldX, but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | "We set the initial lr = 1×10^-3 and decay the learning rate exponentially to reach 1×10^-4 at the last step. The batch size is 16, which is consistent across different tasks. It takes 200 epochs for dyMEAN to converge in the tasks of epitope-binding CDR-H3 design and affinity optimization, while the number is 250 in the task of complex structure prediction." (See the learning-rate sketch below.)
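
The cluster-level split quoted under Dataset Splits can be mirrored in a few lines. This is a minimal sketch, assuming antibodies have already been clustered by CDR-H3 (e.g., with MMseqs2) into a dict `clusters` mapping cluster id to member entries, and that `test_cluster_ids` holds the clusters of the RAbD test antibodies; all names here are illustrative and not taken from the released dyMEAN code.

```python
# Hedged sketch of a 9 : 1 cluster-level train/validation split with
# test-cluster exclusion, as described in the Dataset Splits row.
import random

def split_by_cluster(clusters, test_cluster_ids, ratio=0.9, seed=42):
    # Drop any cluster shared with the test set to avoid leakage.
    eligible = [cid for cid in clusters if cid not in test_cluster_ids]
    random.Random(seed).shuffle(eligible)
    n_train = int(len(eligible) * ratio)  # 9 : 1 split at the cluster level
    train_ids, valid_ids = eligible[:n_train], eligible[n_train:]
    train = [ab for cid in train_ids for ab in clusters[cid]]
    valid = [ab for cid in valid_ids for ab in clusters[cid]]
    return train, valid
```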
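
For the Hardware Specification row, the quoted setup (Adam plus PyTorch's data-parallel wrapper across two GPUs) would look roughly like the sketch below; the `nn.Linear` placeholder stands in for the actual dyMEAN network and is not the paper's implementation.

```python
# Hedged sketch of "Adam optimizer with the data-parallel framework of
# PyTorch on 2 GeForce RTX 2080 Ti GPUs".
import torch
from torch import nn

model = nn.Linear(16, 16)                              # placeholder for the dyMEAN model
if torch.cuda.device_count() >= 2:
    model = nn.DataParallel(model, device_ids=[0, 1])  # replicate across 2 GPUs
if torch.cuda.is_available():
    model = model.cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial lr from the paper
```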
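
The Experiment Setup row pins the schedule only at its endpoints (1×10^-3 at the first step, 1×10^-4 at the last), so the per-step decay factor follows from solving lr_init × gamma^total_steps = lr_final, i.e. gamma = (lr_final / lr_init)^(1/total_steps). A sketch with PyTorch's `ExponentialLR`, where `total_steps` is an illustrative guess (200 epochs at batch size 16 over the 3,256 training antibodies), not a value reported by the paper:

```python
# Sketch of an exponential decay from 1e-3 to 1e-4 that lands exactly at the
# final step, per the Experiment Setup quote.
import torch
from torch import nn

model = nn.Linear(16, 16)                        # placeholder module
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

total_steps = 200 * (3256 // 16 + 1)             # ~200 epochs x ~204 batches (illustrative)
gamma = (1e-4 / 1e-3) ** (1.0 / total_steps)     # solves 1e-3 * gamma**total_steps = 1e-4
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

for _ in range(total_steps):
    optimizer.step()                             # real forward/backward pass goes here
    scheduler.step()                             # lr reaches ~1e-4 on the last step
```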