Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Contact-Distil: Boosting Low Homologous Protein Contact Map Prediction by Self-Supervised Distillation
Authors: Qin Wang, Jiayang Chen, Yuzhe Zhou, Yu Li, Liangzhen Zheng, Sheng Wang, Zhen Li, Shuguang Cui | Pages: 4620-4627
AAAI 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show Contact-Distil outperforms previous state-of-the-arts by large margins on CAMEO-L dataset for low homologous PCMP, i.e., around 13.3% and 9.5% improvements against Alphafold2 and MSA Transformer respectively when MSA count less than 10. |
| Researcher Affiliation | Collaboration | 1 The Chinese University of Hong Kong (Shenzhen); 2 The Chinese University of Hong Kong; 3 The Future Network of Intelligence Institute (FNii); 4 Shenzhen Research Institute of Big Data; 5 Shanghai Zelixir Biotech |
| Pseudocode | Yes | Algorithm 1: Contact-Distil for PCMP |
| Open Source Code | Yes | The source code and dataset are released. https://github.com/qinwang-ai/Contact-Distil |
| Open Datasets | Yes | Two public-available datasets are utilized to examine the performance of Contact-Distil with other approaches. trRosetta: It is first proposed in (Yang et al. 2020)... CAMEO-L: We randomly downsample proteins of previous half-year in 2021 on CAMEO dataset (Haas et al. 2013)... |
| Dataset Splits | Yes | We randomly divide the 15051 proteins into training set and validation set according to the ratio 8:2 respectively. |
| Hardware Specification | Yes | 4 Nvidia V100 GPU cards are utilized to optimize the model. |
| Software Dependencies | No | The Contact-Distil is implemented by Pytorch and Ignite. |
| Experiment Setup | Yes | The initial learning rates (LR) for MSA transformer and contact predictor are 10⁻⁵ and 10⁻⁴ respectively, and a cosine learning rate schedule with 2-epoch warming up steps. |
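
The Experiment Setup row describes a cosine learning rate schedule with a 2-epoch warm-up. The following is a minimal sketch of what such a schedule could look like, not the authors' implementation. It assumes a linear warm-up from zero, decay to zero at the end of training, initial rates of 1e-5 and 1e-4, and a hypothetical total of 20 epochs (the row does not state the training length). The function name `lr_at_epoch` is illustrative.

```python
import math


def lr_at_epoch(epoch: float, base_lr: float,
                warmup_epochs: float = 2.0,
                total_epochs: float = 20.0) -> float:
    """Learning rate under linear warm-up followed by cosine decay.

    Assumptions (not stated in the table): warm-up is linear from 0,
    the cosine decays to 0, and training lasts `total_epochs` epochs.
    """
    if epoch < warmup_epochs:
        # Warm-up phase: ramp linearly from 0 up to base_lr.
        return base_lr * epoch / warmup_epochs
    # Fraction of the post-warm-up phase completed, clamped to [0, 1].
    progress = min((epoch - warmup_epochs) / (total_epochs - warmup_epochs), 1.0)
    # Cosine decay from base_lr (progress = 0) to 0 (progress = 1).
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Two parameter groups, as in the row: MSA transformer and contact predictor.
MSA_TRANSFORMER_LR = 1e-5  # assumed reading of the initial rate
CONTACT_PREDICTOR_LR = 1e-4  # assumed reading of the initial rate

if __name__ == "__main__":
    for epoch in (0, 1, 2, 11, 20):
        print(epoch,
              lr_at_epoch(epoch, MSA_TRANSFORMER_LR),
              lr_at_epoch(epoch, CONTACT_PREDICTOR_LR))
```

In a PyTorch training loop, the same shape could be obtained by wrapping a multiplier of this form in `torch.optim.lr_scheduler.LambdaLR`, with one optimizer parameter group per component so that each keeps its own initial rate.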