Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Towards Multiscale Graph-based Protein Learning with Geometric Secondary Structural Motifs

Authors: Shih-Hsin Wang, Yuhao Huang, Taos Transue, Justin Baker, Jonathan Forstater, Thomas Strohmer, Bao Wang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, we demonstrate that integrating baseline GNNs into our multiscale framework remarkably improves prediction accuracy and reduces computational cost across various benchmarks. ... 5 Numerical Experiments: We evaluate the effectiveness and efficiency of our proposed Secondary Structure-based Hierarchical Graph (SSHG) learning framework on two benchmark protein modeling tasks: enzyme reaction classification [15] and protein-ligand binding affinity (LBA) prediction [39, 27].
Researcher Affiliation	Academia	(1) Department of Mathematics and Scientific Computing and Imaging (SCI) Institute, University of Utah, Salt Lake City, UT 84102, USA; (2) Department of Mathematics, UCLA, Los Angeles, CA 90095, USA; (3) Department of Mathematics, UC Davis, Davis, CA 95616, USA
Pseudocode	No	The paper describes the GNN architecture and message passing equations (e.g., equations 1, 2, 3), and briefly outlines the steps for constructing hierarchical graphs and segmenting secondary structures. It also describes the DSSP algorithm in Appendix D, but this is a pre-existing algorithm, not pseudocode for the authors' proposed method. There is no explicitly labeled pseudocode or algorithm block for the proposed framework.
Open Source Code	Yes	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We have submitted the code and data as a supplementary file.
Open Datasets	Yes	Reaction dataset. For the reaction classification task, 3D structures of 37,428 proteins corresponding to 384 enzyme commission (EC) numbers are obtained from the Protein Data Bank, with EC annotations for each protein retrieved from the SIFTS database [7]. ... LBA dataset. Following [18], we perform ligand binding affinity predictions on a subset of the commonly-used PDBbind refined set [39, 27].
Dataset Splits	Yes	Reaction dataset. ... The dataset is divided into 29,215 proteins for training, 2,562 for validation, and 5,651 for testing. Each EC number is represented across all three splits, and protein chains sharing more than 50% sequence similarity are grouped. ... LBA dataset. ... The curated dataset of 3,507 complexes is split into train/val/test splits based on a 30% sequence identity threshold to verify the model generalization ability for unseen proteins.
Hardware Specification	Yes	All models are implemented using PyTorch Geometric [10] and trained on NVIDIA RTX 3090 GPUs. ... All are conducted on a single NVIDIA GeForce RTX 3090 24 GB.
Software Dependencies	No	All models are implemented using PyTorch Geometric [10] and trained on NVIDIA RTX 3090 GPUs. ... The implementation of our methods is based on PyTorch and PyTorch Geometric, and all models are trained with the Adam optimizer. ... mamba_ssm, a highly optimized CUDA C++ implementation. No specific version numbers for PyTorch, PyTorch Geometric, or CUDA are provided.
Experiment Setup	Yes	Experiment Setup: All models are implemented using PyTorch Geometric [10] and trained on NVIDIA RTX 3090 GPUs. To mitigate overfitting, we follow [37] and apply Gaussian noise (std = 0.1) and anisotropic scaling in the range [0.9, 1.1] to the node coordinates in both the original graph framework and SSHG framework. Additionally, we randomly mask amino acid types and secondary structure types with probabilities of 0.1 or 0.2. ... The number of message passing blocks, hidden channels, and dropout rates used for training SSHG on different tasks are listed in Table 6. The hyperparameter searching space for training is shown in Table 7.
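
The augmentation described in the Experiment Setup row (Gaussian coordinate noise with std 0.1, anisotropic scaling in [0.9, 1.1], and random masking of categorical labels) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the mask token value, the order of noise and scaling, and the choice of one scale factor per axis shared across all nodes are assumptions made for illustration only.

```python
import random


def augment_coords(coords, noise_std=0.1, scale_range=(0.9, 1.1), rng=None):
    """Perturb 3D node coordinates with Gaussian noise, then scale each axis.

    Assumptions (not stated in the source): noise is added before scaling,
    and a single scale factor per axis is shared by all nodes.
    """
    rng = rng or random.Random()
    # Anisotropic scaling: an independent factor for x, y, and z.
    scales = [rng.uniform(*scale_range) for _ in range(3)]
    return [
        tuple((c + rng.gauss(0.0, noise_std)) * s for c, s in zip(point, scales))
        for point in coords
    ]


def mask_labels(labels, mask_prob=0.1, mask_token=-1, rng=None):
    """Replace each label with mask_token independently with probability mask_prob.

    mask_token=-1 is a hypothetical placeholder for a "masked" class index.
    """
    rng = rng or random.Random()
    return [mask_token if rng.random() < mask_prob else lab for lab in labels]


if __name__ == "__main__":
    rng = random.Random(0)
    coords = [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)]
    residue_types = [4, 7]  # hypothetical amino acid type indices
    print(augment_coords(coords, rng=rng))
    print(mask_labels(residue_types, mask_prob=0.2, rng=rng))
```

The same masking helper could be applied to secondary structure types; the source gives masking probabilities of 0.1 or 0.2 without stating which value applies to which task.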