Multi-Scale Representation Learning on Proteins

Authors: Vignesh Ram Somnath, Charlotte Bunne, Andreas Krause

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test the learned representation on different tasks, (i.) ligand binding affinity (regression), and (ii.) protein function prediction (classification). On the regression task, contrary to previous methods, our model performs consistently and reliably across different dataset splits, outperforming all baselines on most splits. On the classification task, it achieves a performance close to the top-performing model while using 10x fewer parameters.
Researcher Affiliation | Academia | Vignesh Ram Somnath, Dept. of Computer Science, ETH Zurich (vsomnath@ethz.ch); Charlotte Bunne, Dept. of Computer Science, ETH Zurich (bunnec@ethz.ch); Andreas Krause, Dept. of Computer Science, ETH Zurich (krausea@ethz.ch)
Pseudocode | No | The paper describes the architecture and processes in text and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about the release of its own source code or a link to a repository.
Open Datasets | Yes | Dataset. The PDBBIND database (version 2019) [Liu et al., 2017] is a collection of the experimentally measured binding affinity data for all types of biomolecular complexes deposited in the Protein Data Bank [Berman et al., 2000].
Dataset Splits | Yes | We split the dataset into training, test and validation splits based on the scaffolds of the corresponding ligands (scaffold), or a 30% and a 60% sequence identity threshold (identity 30%, identity 60%) to limit homologous ligands or proteins appearing in both train and test sets.
Hardware Specification | No | The paper does not specify the hardware (e.g., CPU or GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions various software components and models (e.g., MSMS, MPN, MLP, GCN), but does not provide specific version numbers for any of them.
Experiment Setup | No | The paper describes the model architecture and general components (e.g., K iterations of message passing) but does not provide specific hyperparameter values such as learning rates, batch sizes, or optimizer settings.
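The scaffold-based split quoted above assigns whole scaffold groups to a single partition so that no ligand scaffold appears in both train and test. A minimal sketch of this idea in pure Python, assuming a caller-supplied `scaffold_of` function (the paper's exact splitting procedure and thresholds are not reproduced here):

```python
import random
from collections import defaultdict

def scaffold_split(items, scaffold_of, frac_train=0.8, frac_valid=0.1, seed=0):
    """Group items by scaffold and assign whole groups to train/valid/test,
    so no scaffold is shared between splits. Illustrative only."""
    groups = defaultdict(list)
    for item in items:
        groups[scaffold_of(item)].append(item)
    # Visit scaffold groups in a deterministic shuffled order.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n = len(items)
    train, valid, test = [], [], []
    for key in keys:
        bucket = groups[key]
        if len(train) + len(bucket) <= frac_train * n:
            train.extend(bucket)
        elif len(valid) + len(bucket) <= frac_valid * n:
            valid.extend(bucket)
        else:
            test.extend(bucket)
    return train, valid, test
```

A sequence-identity split works analogously, with identity clusters (e.g., at a 30% or 60% threshold) playing the role of scaffold groups.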
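The "K iterations of message passing" mentioned in the Experiment Setup row refers to the standard graph neural network update. A framework-free sketch with sum aggregation and a fixed tanh update standing in for the paper's learned MPN update (all specifics here are illustrative assumptions, not the authors' architecture):

```python
import math

def message_passing(node_feats, edges, K=3):
    """Run K rounds of sum-aggregation message passing.

    node_feats: list of feature vectors (lists of floats);
    edges: list of directed (src, dst) index pairs.
    """
    h = [list(v) for v in node_feats]
    for _ in range(K):
        # Each node sums the current features of its in-neighbors.
        msgs = [[0.0] * len(v) for v in h]
        for src, dst in edges:
            for i, x in enumerate(h[src]):
                msgs[dst][i] += x
        # Node update: combine own state with aggregated messages.
        h = [[math.tanh(x + m) for x, m in zip(hv, mv)]
             for hv, mv in zip(h, msgs)]
    return h
```

After K rounds, each node's feature vector reflects its K-hop neighborhood, which is what makes K a key hyperparameter to report.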