Contact-Distil: Boosting Low Homologous Protein Contact Map Prediction by Self-Supervised Distillation

Authors: Qin Wang, Jiayang Chen, Yuzhe Zhou, Yu Li, Liangzhen Zheng, Sheng Wang, Zhen Li, Shuguang Cui. Pages 4620-4627.

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show Contact-Distil outperforms previous state-of-the-art methods by large margins on the CAMEO-L dataset for low homologous PCMP, i.e., around 13.3% and 9.5% improvements against AlphaFold2 and MSA Transformer respectively when the MSA count is less than 10.
Researcher Affiliation | Collaboration | 1 The Chinese University of Hong Kong (Shenzhen); 2 The Chinese University of Hong Kong; 3 The Future Network of Intelligence Institute (FNii); 4 Shenzhen Research Institute of Big Data; 5 Shanghai Zelixir Biotech
Pseudocode | Yes | Algorithm 1: Contact-Distil for PCMP (a hedged sketch of a distillation training step is given after the table)
Open Source Code | Yes | The source code and dataset are released at https://github.com/qinwang-ai/Contact-Distil
Open Datasets | Yes | Two publicly available datasets are used to compare Contact-Distil with other approaches. trRosetta: first proposed in (Yang et al. 2020)... CAMEO-L: proteins from the previous half-year of 2021, randomly downsampled from the CAMEO dataset (Haas et al. 2013)...
Dataset Splits | Yes | We randomly divide the 15051 proteins into a training set and a validation set at a ratio of 8:2 (an illustrative split snippet follows the table).
Hardware Specification | Yes | 4 Nvidia V100 GPU cards are utilized to optimize the model.
Software Dependencies | No | Contact-Distil is implemented with PyTorch and Ignite.
Experiment Setup | Yes | The initial learning rates for the MSA Transformer and the contact predictor are 1e-5 and 1e-4 respectively, with a cosine learning rate schedule and a 2-epoch warm-up (an optimizer/scheduler sketch follows the table).
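
The Pseudocode row notes that the paper provides Algorithm 1 but does not reproduce it here. Below is a minimal sketch of one self-supervised distillation step for contact map prediction, assuming an MSA-consuming teacher and a single-sequence student contact predictor; the module names, tensor shapes, and loss choice are illustrative assumptions, not the paper's Algorithm 1.

```python
# Hedged sketch of a self-supervised distillation step for contact map prediction.
# The teacher/student roles and the BCE distillation loss are assumptions for
# illustration; they are not taken from the paper's Algorithm 1.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, msa_tokens, seq_tokens, optimizer):
    """One training step in which the student mimics the teacher's contact map."""
    with torch.no_grad():                       # teacher only supplies soft targets here
        teacher_contacts = teacher(msa_tokens)  # (L, L) contact probabilities from the MSA model
    student_logits = student(seq_tokens)        # (L, L) contact logits from the sequence model
    # Self-supervised signal: match the teacher's soft contact map (no ground-truth labels).
    loss = F.binary_cross_entropy_with_logits(student_logits, teacher_contacts)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that the quoted experiment setup suggests the MSA Transformer itself is also fine-tuned in the paper; the frozen-teacher variant above is a simplification.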
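
For the 8:2 split of the 15051 trRosetta proteins reported in the Dataset Splits row, a minimal illustration is sketched below; the placeholder protein IDs and the fixed seed are assumptions.

```python
# Illustrative 8:2 random split of the 15051 trRosetta proteins.
import random

protein_ids = [f"protein_{i}" for i in range(15051)]  # placeholder IDs, not the real accessions
random.seed(0)                                          # seed choice is an assumption
random.shuffle(protein_ids)

split = int(0.8 * len(protein_ids))                     # 8:2 ratio from the report
train_ids, valid_ids = protein_ids[:split], protein_ids[split:]
print(len(train_ids), len(valid_ids))                   # 12040 3011
```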
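
The Experiment Setup row can be read as the following PyTorch configuration; the optimizer choice (Adam), the total epoch count, and the linear shape of the warm-up are assumptions not stated in the quoted text.

```python
# Sketch of the reported optimization setup: 1e-5 for the MSA Transformer, 1e-4 for
# the contact predictor, cosine schedule with a 2-epoch warm-up.
import math
import torch
from torch.optim.lr_scheduler import LambdaLR

msa_transformer = torch.nn.Linear(8, 8)     # stand-ins for the real modules
contact_predictor = torch.nn.Linear(8, 8)

optimizer = torch.optim.Adam([
    {"params": msa_transformer.parameters(), "lr": 1e-5},
    {"params": contact_predictor.parameters(), "lr": 1e-4},
])

warmup_epochs, total_epochs = 2, 50          # total_epochs is an assumed value

def lr_lambda(epoch):
    if epoch < warmup_epochs:                # linear warm-up over the first 2 epochs
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))   # cosine decay to zero

scheduler = LambdaLR(optimizer, lr_lambda=lr_lambda)     # call scheduler.step() once per epoch
```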