Multi-Scale Representation Learning on Proteins
Authors: Vignesh Ram Somnath, Charlotte Bunne, Andreas Krause
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test the learned representation on different tasks, (i.) ligand binding affinity (regression), and (ii.) protein function prediction (classification). On the regression task, contrary to previous methods, our model performs consistently and reliably across different dataset splits, outperforming all baselines on most splits. On the classification task, it achieves a performance close to the top-performing model while using 10x fewer parameters. |
| Researcher Affiliation | Academia | Vignesh Ram Somnath, Dept. of Computer Science, ETH Zurich (vsomnath@ethz.ch); Charlotte Bunne, Dept. of Computer Science, ETH Zurich (bunnec@ethz.ch); Andreas Krause, Dept. of Computer Science, ETH Zurich (krausea@ethz.ch) |
| Pseudocode | No | The paper describes the architecture and processes in text format and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its own source code or a link to a repository. |
| Open Datasets | Yes | Dataset. The PDBBIND database (version 2019) [Liu et al., 2017] is a collection of the experimentally measured binding affinity data for all types of biomolecular complexes deposited in the Protein Data Bank [Berman et al., 2000]. |
| Dataset Splits | Yes | We split the dataset into training, test and validation splits based on the scaffolds of the corresponding ligands (scaffold), or a 30% and a 60% sequence identity threshold (identity 30%, identity 60%) to limit homologous ligands or proteins appearing in both train and test sets. (See the scaffold-splitting sketch after this table.) |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions various software components and models (e.g., MSMS, MPN, MLP, GCN), but does not provide specific version numbers for any of them. |
| Experiment Setup | No | The paper describes the model architecture and general components (e.g., K iterations of message passing) but does not provide specific hyperparameter values such as learning rates, batch sizes, or optimizer settings. (A generic message-passing sketch follows this table.) |
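
The Dataset Splits row quotes a scaffold-based split of the ligands. Below is a minimal sketch of how such a split is commonly implemented with RDKit's Bemis-Murcko scaffolds; the function name `scaffold_split`, the split fractions, and the largest-group-first assignment are illustrative assumptions rather than the authors' exact procedure, and the sequence-identity splits (30%/60%) would additionally require clustering the protein sequences.

```python
# Hedged sketch: scaffold-based train/val/test split for PDBBind-style ligands.
# Assumes ligands are given as SMILES strings and that RDKit is available;
# the exact split procedure used in the paper is not specified.
from collections import defaultdict

from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold


def scaffold_split(smiles_list, frac_train=0.8, frac_val=0.1):
    """Group ligands by Bemis-Murcko scaffold, then fill the train/val/test
    splits with whole scaffold groups (largest groups first) so that no
    scaffold appears in more than one split."""
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue  # skip unparsable ligands
        scaffold = MurckoScaffold.MurckoScaffoldSmiles(mol=mol, includeChirality=False)
        groups[scaffold].append(idx)

    train, val, test = [], [], []
    n = len(smiles_list)
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(group) <= frac_train * n:
            train.extend(group)
        elif len(val) + len(group) <= frac_val * n:
            val.extend(group)
        else:
            test.extend(group)
    return train, val, test
```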
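
For context on the "K iterations of message passing" mentioned in the Experiment Setup row, the sketch below shows a generic message-passing network layer in PyTorch. It is not the paper's multi-scale architecture; the class name `SimpleMPN`, the GRU-based node update, and all dimensions are placeholder assumptions chosen only to illustrate what repeating K message-passing iterations looks like.

```python
# Hedged sketch: a generic message-passing layer applied K times.
# This is NOT the authors' architecture; hyperparameters are placeholders.
import torch
import torch.nn as nn


class SimpleMPN(nn.Module):
    def __init__(self, node_dim, hidden_dim, K=3):
        super().__init__()
        self.K = K
        self.message = nn.Linear(2 * node_dim, hidden_dim)
        self.update = nn.GRUCell(hidden_dim, node_dim)

    def forward(self, node_feats, edge_index):
        # node_feats: (num_nodes, node_dim); edge_index: (2, num_edges)
        src, dst = edge_index
        h = node_feats
        for _ in range(self.K):
            # Compute a message for each directed edge from its endpoints.
            msg = torch.relu(self.message(torch.cat([h[src], h[dst]], dim=-1)))
            # Sum incoming messages per target node.
            agg = torch.zeros(h.size(0), msg.size(-1), device=h.device)
            agg.index_add_(0, dst, msg)
            # Update node states from the aggregated messages.
            h = self.update(agg, h)
        return h
```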