Geometric Transformers for Protein Interface Contact Prediction
Authors: Alex Morehead, Chen Chen, Jianlin Cheng
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In rigorous benchmarks, DeepInteract, on challenging protein complex targets from the 13th and 14th CASP-CAPRI experiments as well as Docking Benchmark 5, achieves 14% and 1.1% top L/5 precision (L: length of a protein unit in a complex), respectively. In doing so, DeepInteract, with the Geometric Transformer as its graph-based backbone, outperforms existing methods for interface contact prediction in addition to other graph-based neural network backbones compatible with DeepInteract, thereby validating the effectiveness of the Geometric Transformer for learning rich relational-geometric features for downstream tasks on 3D protein structures. |
| Researcher Affiliation | Academia | Alex Morehead, Chen Chen, & Jianlin Cheng, Department of Electrical Engineering & Computer Science, University of Missouri, Columbia, MO 65211, USA, {acmwhb,chen.chen,chengji}@missouri.edu |
| Pseudocode | No | The paper describes the architecture and operations of the Geometric Transformer and interaction module but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Training and inference code as well as pre-trained models are available at https://github.com/BioinfoMachineLearning/DeepInteract |
| Open Datasets | Yes | Keeping this in mind, for our training and validation datasets, we chose to use DIPS-Plus (Morehead et al. (2021)), to our knowledge the largest feature-rich dataset of protein complexes for protein interface contact prediction to date. |
| Dataset Splits | Yes | To expedite training and validation and to constrain memory usage, beginning with all remaining complexes not chosen for testing, we filtered out all complexes where either chain contains fewer than 20 residues and where the number of possible interface contacts is more than 256², leaving us with an intermediate total of 26,504 complexes for training and validation. In DIPS-Plus, binary protein complexes are grouped into shared directories according to whether they are derived from the same parent complex. As such, using a per-directory strategy, we randomly designate 80% of these complexes for training and 20% for validation to restrict overlap between our cross-validation datasets. After choosing these targets for testing, we then filter out complexes from our training and validation partitions of DIPS-Plus that contain any chain with over 30% sequence identity to any chain in any complex in our test datasets. This threshold of 30% sequence identity is commonly used in the bioinformatics literature (Jordan et al. (2012), Yang et al. (2013)) to prevent large evolutionary overlap between a dataset's cross-validation partitions. However, most existing works for interface contact prediction do not employ such filtering criteria, so the results reported in these works may be over-optimistic by nature. In performing such sequence-based filtering, we are left with 15,618 and 3,548 binary complexes for training and validation, respectively. |
| Hardware Specification | Yes | The OLCF houses the Summit compute cluster. Summit, launched in 2018, delivers 8 times the computational performance of Titan's 18,688 nodes, using only 4,608 nodes. Like Titan, Summit has a hybrid architecture, and each node contains multiple IBM POWER9 CPUs and NVIDIA Volta GPUs all connected with NVIDIA's high-speed NVLink. Each node has over half a terabyte of coherent memory (high bandwidth memory + DDR4) addressable by all CPUs and GPUs plus 800GB of non-volatile RAM that can be used as a burst buffer or as extended memory. To provide a high rate of I/O throughput, the nodes are connected in a non-blocking fat-tree using a dual-rail Mellanox EDR InfiniBand interconnect. We used the Summit compute cluster to train all our models. |
| Software Dependencies | Yes | In addition, we used Python 3.8 (Van Rossum & Drake (2009)), PyTorch 1.7.1 (Paszke et al. (2019)), and PyTorch Lightning 1.4.8 (Falcon (2019)) to run our deep learning experiments. |
| Experiment Setup | Yes | For all experiments conducted with DeepInteract, we used 2 layers of the graph neural network chosen for the experiment and 128 intermediate GNN and CNN channels to restrict the time required to train each model. For the Geometric Transformer, we used an edge geometric neighborhood of size n = 2 for each edge such that each edge's geometric features are updated by their 4-nearest incoming edges. In addition, we used the Adam optimizer (Kingma & Ba (2014)), a learning rate of 1e-3, a weight decay rate of 1e-2, a dropout (i.e., forget) rate of 0.2, and a batch size of 1. We also employed 0.5-threshold gradient value clipping and stochastic weight averaging (Izmailov et al. (2018)). With an early-stopping patience period of 5 epochs, we observed most models converging after approximately 30 training epochs on DIPS-Plus. For our loss function, we used weighted cross entropy with a positive class weight of 5 to help the network overcome the large class imbalance present in interface prediction. All DeepInteract models employed 14 layers of our dilated ResNet architecture described in Section 4.3 and had their top-k metrics averaged over three separate runs, each with a different random seed (standard deviation of top-k metrics in parentheses). |
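The Dataset Splits row above describes three concrete steps: size/contact-count filtering, a per-directory 80%/20% split, and a 30% sequence-identity filter against the test set. Below is a minimal sketch of the first two steps under the assumption that DIPS-Plus complexes are laid out as per-parent-complex directories; the helper names (`keep_complex`, `split_by_directory`) and the directory handling are illustrative, not the authors' released code, and the sequence-identity filter is omitted because it requires an external alignment tool.

```python
# Minimal sketch of the DIPS-Plus filtering and per-directory 80%/20% split
# described in the "Dataset Splits" row. Directory layout and helper names
# are assumptions for illustration, not the authors' released code.
import random
from pathlib import Path
from typing import List, Tuple

MIN_CHAIN_RESIDUES = 20            # drop complexes where either chain has < 20 residues
MAX_INTERFACE_CONTACTS = 256 ** 2  # drop complexes with more than 256^2 possible contacts
SEQ_ID_THRESHOLD = 0.30            # 30% identity cutoff vs. test chains (needs an aligner; not shown)


def keep_complex(chain_a_len: int, chain_b_len: int) -> bool:
    """Size- and contact-count-based filters applied before splitting."""
    if min(chain_a_len, chain_b_len) < MIN_CHAIN_RESIDUES:
        return False
    return chain_a_len * chain_b_len <= MAX_INTERFACE_CONTACTS


def split_by_directory(complex_dirs: List[Path], seed: int = 42) -> Tuple[List[Path], List[Path]]:
    """Assign whole directories (complexes sharing a parent complex) to either
    training (80%) or validation (20%), so related complexes never straddle the split."""
    rng = random.Random(seed)
    dirs = list(complex_dirs)
    rng.shuffle(dirs)
    cut = int(0.8 * len(dirs))
    return dirs[:cut], dirs[cut:]
```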
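Because exact package versions matter for reproduction, one quick way to check a local environment against the Software Dependencies row is a few assertions; pinning via asserts is my own convention, not something the repository prescribes.

```python
# Sanity-check the local environment against the versions reported above.
import sys
import torch
import pytorch_lightning as pl

assert sys.version_info[:2] == (3, 8), f"expected Python 3.8, found {sys.version}"
assert torch.__version__.startswith("1.7.1"), f"expected PyTorch 1.7.1, found {torch.__version__}"
assert pl.__version__.startswith("1.4.8"), f"expected Lightning 1.4.8, found {pl.__version__}"
```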
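The Experiment Setup row lists optimizer, clipping, SWA, and early-stopping choices that map fairly directly onto PyTorch / PyTorch Lightning 1.4-era APIs. The sketch below is a hedged reading of that row, not the authors' code: `model`, the monitored metric name `val_loss`, `max_epochs`, and the choice of `CrossEntropyLoss` (the row only says "weighted cross entropy") are placeholders or plausible interpretations, and model-level settings (2 GNN layers, 128 channels, dropout 0.2) would live in the module definition and are not shown.

```python
# Hedged training-configuration sketch matching the "Experiment Setup" row,
# written against PyTorch 1.7.1 / PyTorch Lightning 1.4.8 as reported above.
import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping

# Weighted cross entropy with a positive-class weight of 5; the exact loss class
# is not stated, so a two-class CrossEntropyLoss with a class-weight vector is
# one plausible reading.
criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor([1.0, 5.0]))


def configure_optimizers(model: torch.nn.Module):
    # Adam with the reported learning rate (1e-3) and weight decay (1e-2).
    return torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-2)


trainer = pl.Trainer(
    gpus=1,                            # single-GPU placeholder; the paper trained on Summit nodes
    max_epochs=50,                     # placeholder; most models converged after ~30 epochs
    gradient_clip_val=0.5,             # 0.5-threshold gradient clipping...
    gradient_clip_algorithm="value",   # ...by value, not by norm
    stochastic_weight_avg=True,        # SWA flag available in Lightning 1.4.x
    callbacks=[EarlyStopping(monitor="val_loss", patience=5, mode="min")],
)
# trainer.fit(lightning_module, train_dataloader, val_dataloader)  # batch size of 1 per step
```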