GeoAB: Towards Realistic Antibody Design and Reliable Affinity Maturation

Authors: Haitao Lin, Lirong Wu, Yufei Huang, Yunfan Liu, Odin Zhang, Yuanqing Zhou, Rui Sun, Stan Z. Li

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that GeoAB achieves state-of-the-art performance in CDR co-design and mutation effect predictions, and fulfills the discussed tasks effectively.
Researcher Affiliation | Academia | (1) Zhejiang University; (2) AI Lab, Research Center for Industries of the Future, Westlake University.
Pseudocode | No | The paper describes its algorithms and models in text and equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | For GeoAB, our model is open to the public through https://github.com/Edapinenut/GeoAB.
Open Datasets | Yes | Following (Kong et al., 2023a), we use SAbDab (Dunbar et al., 2014) with complete antibody-antigen structures for training. [...] Following (Luo et al., 2023), we use SKEMPI2 (Jankauskaitė et al., 2019) as the evaluation datasets.
Dataset Splits | Yes | The splits of training, validation and test sets are according to the clustering of CDRs via MMseqs2 (Steinegger & Söding, 2017). [...] The datasets are split into three folds by structure, in which two of them are used for training and validation, and the rest are used for testing. (A sketch of this split scheme follows the table.)
Hardware Specification | No | The paper mentions that the 'Westlake University HPC Center' provided computational resources, but it does not specify any particular hardware components such as GPU models (e.g., NVIDIA A100), CPU models, or memory details.
Software Dependencies | No | The paper describes the architecture and components such as 'MLP', 'GAT', 'SiLU' activation, and 'Dropout', but it does not specify versions for any programming languages, libraries (e.g., Python, PyTorch, TensorFlow), or solvers used.
Experiment Setup | Yes | The heterogeneous residue-level encoder is parameterized as 9 layers of heterogeneous GNNs. In each layer, the MLP is constructed as Linear + SiLU + Linear, with Dropout probability equal to 0.1 to avoid over-fitting. The embedding dim is set to 128. [...] The learning rate lr is 5e-4. In all training, the max training epoch is 20. A Lambda LR schedule is used, with lr_lambda set to 0.95. [...] Batch size is set to 8, the layer number is 6, and the embed size is set to 128 throughout for a fair comparison.
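
To make the three-fold, structure-based scheme in the Dataset Splits row concrete, here is a minimal sketch. It assumes cluster or structure group IDs (e.g., from MMseqs2 clustering of CDRs) have already been computed; `three_fold_split` and `cluster_of` are hypothetical names, not code from the GeoAB repository.

```python
# A minimal sketch of the three-fold split scheme; `cluster_of` (mapping each
# entry to a precomputed MMseqs2/structure cluster ID) and the function name
# are hypothetical, not taken from the GeoAB codebase.
from typing import Dict, List, Sequence

def three_fold_split(entries: Sequence[str],
                     cluster_of: Dict[str, int],
                     test_fold: int) -> Dict[str, List[str]]:
    """Assign whole clusters to folds 0-2, hold one fold out for testing,
    and use the remaining two for training and validation."""
    assert test_fold in (0, 1, 2)
    folds: Dict[int, List[str]] = {0: [], 1: [], 2: []}
    for entry in entries:
        # Entries sharing a cluster land in the same fold, so similar
        # CDRs/structures never leak between train/val and test.
        folds[cluster_of[entry] % 3].append(entry)
    train_val = [e for f in (0, 1, 2) if f != test_fold for e in folds[f]]
    return {"train_val": train_val, "test": folds[test_fold]}
```

Assigning entire clusters to folds, rather than individual entries, is what prevents near-duplicate CDRs or structures from appearing on both sides of the split.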
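
The hyperparameters quoted in the Experiment Setup row translate directly into a few lines of PyTorch. The sketch below is an illustration of those values, not the authors' implementation: the optimizer choice (Adam), the placement of Dropout inside the MLP, and the reading of lr_lambda = 0.95 as per-epoch multiplicative decay are assumptions, and the heterogeneous-GNN message passing around each layer's MLP is omitted.

```python
# A sketch, not the authors' code: the optimizer (Adam), the Dropout position
# inside the MLP, and reading "lr_lambda = 0.95" as per-epoch multiplicative
# decay are assumptions; the heterogeneous GNN message passing is omitted.
import torch
import torch.nn as nn

EMBED_DIM = 128    # "The embedding dim is set to 128."
DROPOUT_P = 0.1    # "Dropout probability equal to 0.1"
NUM_LAYERS = 9     # "9 layers of heterogeneous GNNs"

class LayerMLP(nn.Module):
    """Per-layer MLP: Linear + SiLU + Linear, with Dropout."""
    def __init__(self, dim: int = EMBED_DIM, p: float = DROPOUT_P):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim),
            nn.SiLU(),
            nn.Dropout(p),   # exact Dropout placement is not specified
            nn.Linear(dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

encoder = nn.ModuleList([LayerMLP() for _ in range(NUM_LAYERS)])
optimizer = torch.optim.Adam(encoder.parameters(), lr=5e-4)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: 0.95 ** epoch
)

for epoch in range(20):   # "the max training epoch is 20"
    # ... one pass over the training set with batch size 8 goes here ...
    scheduler.step()
```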