Reprogramming Pretrained Language Models for Antibody Sequence Infilling
Authors: Igor Melnyk, Vijil Chenthamarakshan, Pin-Yu Chen, Payel Das, Amit Dhurandhar, Inkit Padhi, Devleena Das
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results on antibody design benchmarks show that our model, trained on a low-resource antibody sequence dataset, generates highly diverse CDR sequences, with up to a more than two-fold increase in diversity over the baselines, without losing structural integrity or naturalness. (See the infilling sketch below the table.) |
| Researcher Affiliation | Collaboration | 1IBM Research, Yorktown Heights, NY 10598, USA. 2Georgia Institute of Technology, Atlanta, GA 30332, USA. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/IBM/ReprogBERT |
| Open Datasets | Yes | Structural Antibody Database (SAbDab) (Dunbar et al., 2013), Rosetta Antibody Design (RAbD) (Jin et al., 2021), and the CoV-AbDab dataset (Raybould et al., 2021) |
| Dataset Splits | Yes | Table 2. Statistics of the Structural Antibody Database (SAbDab) for the training, validation and test splits across the three CDRs. |
| Hardware Specification | Yes | We trained all models on a single A100 40GB GPU. |
| Software Dependencies | No | The paper mentions software like BERT, AlphaFold, IgFold, and ProGen, but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Learning rate: 1e-5; Batch size: 32; Optimizer: Adam (see the training-loop sketch below the table) |
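
For context on the Result row, the sketch below illustrates plain masked-LM infilling with an off-the-shelf BERT, the general mechanism the paper builds on. It is not the authors' ReprogBERT, which reprograms an English-vocabulary BERT onto the amino-acid vocabulary; the model name and the toy input here are illustrative assumptions.

```python
# Minimal masked-LM infilling sketch (generic BERT, NOT the paper's ReprogBERT).
# A contiguous masked span stands in for a masked CDR region.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Toy input: the masked span is analogous to a CDR to be infilled.
text = "the antibody binds the [MASK] [MASK] region of the antigen"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedily decode the model's top prediction at each masked position.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
for pos in mask_positions:
    top_id = logits[0, pos].argmax(-1).item()
    print(pos.item(), tokenizer.decode([top_id]))
```

Sampling from the per-position distributions instead of taking the argmax is what yields multiple diverse candidate infills, the property the diversity comparison in the Result row measures.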
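
The Experiment Setup row reports only three hyperparameters. A minimal training-loop sketch matching them (Adam, learning rate 1e-5, batch size 32) is given below; the model and dataset are toy placeholders, not the authors' training script.

```python
# Minimal sketch of the reported training configuration; the linear model and
# random tensors are placeholders standing in for the masked LM and SAbDab data.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Linear(16, 4)  # placeholder for the pretrained masked LM
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # reported optimizer and LR

data = TensorDataset(torch.randn(256, 16), torch.randint(0, 4, (256,)))
loader = DataLoader(data, batch_size=32, shuffle=True)  # reported batch size
loss_fn = nn.CrossEntropyLoss()

for x, y in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```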