Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Enhancing Graph Contrastive Learning for Protein Graphs from Perspective of Invariance
Authors: Yusong Wang, Shiyin Tan, Jialun Shen, Yicheng Xu, Haobo Song, Qi Xu, Prayag Tiwari, Mingkun Xu
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four different protein-related tasks demonstrate the superiority of our proposed GCL protein representation learning framework. Experimental results on four downstream protein-related tasks highlight the superior performance of our proposed GCL framework, demonstrating its effectiveness in learning protein representations. We summarize the performance comparison of different combinations of FCI and 3-PSI augmentations across four tasks in Table 9. Extra quantitative results are provided in Appendix F. We evaluate the effect of augmentation strength of our framework in 2D topology-based and 3D structure-based graph augmentations, using the optimal framework setting for each task (e.g., 3-PSIAlpha + FCI for EC, 3-PSIDiag + FCI for GO). The complete results are provided in Appendix F.6. Robustness Analysis. In practical application, the protein structures of test data may undergo structural changes in response to environmental factors such as pH, temperature, or the presence of specific ions (Wang et al., 2008). This variability poses a challenge to the model's robustness against structural fluctuations during testing. To systematically evaluate such robustness of our proposed method, we randomly select a proportion (10% to 50%) of residues within each protein sample in the test set and apply rotational transformations to these residues along with their connected segments, mimicking intrinsic structural fluctuations in proteins. |
| Researcher Affiliation | Academia | 1Guangdong Institute of Intelligence Science and Technology, Zhuhai, China 2School of Engineering, Institute of Science Tokyo, Tokyo, Japan 3Graduate School of Medicine, The University of Tokyo, Tokyo, Japan 4School of Computer Science and Technology, Dalian University of Technology, Dalian, China 5School of Information Technology, Halmstad University, Halmstad, Sweden. Correspondence to: Mingkun Xu <EMAIL>. |
| Pseudocode | No | The paper describes methods using mathematical formulations (Eqs. 1–34 and 36–40) and textual descriptions, but there are no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific links to a code repository, nor does it contain an explicit statement that the code is being released or will be made publicly available. It only mentions the tools used for implementation: "Our approach is implemented using the PyTorch framework 2.1.0 and the PyTorch-Geometric library 2.6.0. The CUDA version is 12.4." |
| Open Datasets | Yes | Tasks. Building upon the evaluation protocols of Fan et al. (2023), we assess the effectiveness of our approach across four protein-related tasks: Protein Fold Classification (FOLD), Enzyme Reaction Classification (Reaction), Gene Ontology Term Prediction (GO), and Enzyme Commission (EC) number prediction. For FOLD, we evaluate performance under three scenarios: fold, superfamily, and family classification. For GO, we assess performance across three sub-tasks: biological process (BP), molecular function (MF), and cellular component (CC) ontology term prediction. Details of tasks and datasets are provided in Appendix D.1. Protein Fold Classification. Protein fold classification is crucial for understanding the relationship between protein structure and evolution. Fold classes capture secondary structure compositions, orientations, and connection orders. Following Hermosilla et al. (2021), we conduct fold classification using the SCOPe 1.75 dataset (Hou et al., 2018), comprising 16,712 proteins across 1,195 fold classes. The 3D coordinates are derived from the SCOPe 1.75 database (Murzin et al., 1995). Enzyme Reaction Classification. This task involves classifying enzyme-catalyzed reactions based on all four levels of the Enzyme Commission (EC) number (Webb, 1992), representing a protein function classification problem. We utilize the dataset by Hermosilla et al. (2021), which includes 384 four-level Enzyme Commission classes and comprises 29,215/2,562/5,651 proteins for training/validation/test, respectively. Gene Ontology Term Prediction. This multi-label classification task predicts protein functions through Gene Ontology terms, organized into three hierarchical ontologies: biological process (BP, 1,943 classes), molecular function (MF, 489 classes), and cellular component (CC, 320 classes). Using the dataset from Gligorijević et al. (2021), we train/validate/test on 29,898/3,322/3,415 proteins, respectively. Enzyme Commission Number Prediction. This multi-label task predicts three-level and four-level EC numbers across 538 classes. Using the dataset from Gligorijević et al. (2021), we train/validate/test on 15,550/1,729/1,919 proteins, respectively. |
| Dataset Splits | Yes | Enzyme Reaction Classification. This task involves classifying enzyme-catalyzed reactions based on all four levels of the Enzyme Commission (EC) number (Webb, 1992), representing a protein function classification problem. We utilize the dataset by Hermosilla et al. (2021), which includes 384 four-level Enzyme Commission classes and comprises 29,215/2,562/5,651 proteins for training/validation/test, respectively. Mean accuracy is the evaluation metric. Gene Ontology Term Prediction. This multi-label classification task predicts protein functions through Gene Ontology terms, organized into three hierarchical ontologies: biological process (BP, 1,943 classes), molecular function (MF, 489 classes), and cellular component (CC, 320 classes). Using the dataset from Gligorijević et al. (2021), we train/validate/test on 29,898/3,322/3,415 proteins, respectively. The Fmax metric (Fan et al., 2023) is used for evaluation. Enzyme Commission Number Prediction. This multi-label task predicts three-level and four-level EC numbers across 538 classes. Using the dataset from Gligorijević et al. (2021), we train/validate/test on 15,550/1,729/1,919 proteins, respectively. The Fmax metric is applied for evaluation. For GO term and EC number prediction, we adhere to the multi-cutoff splits from Gligorijević et al. (2021), ensuring the test set only includes PDB chains with a sequence identity ≤ 95% to the training set, consistent with Zhang et al. (2023); Fan et al. (2023). |
| Hardware Specification | Yes | Our approach is implemented using the PyTorch framework 2.1.0 and the PyTorch-Geometric library 2.6.0. The CUDA version is 12.4. It is trained on RTX 3090 GPUs. |
| Software Dependencies | Yes | Our approach is implemented using the PyTorch framework 2.1.0 and the PyTorch-Geometric library 2.6.0. The CUDA version is 12.4. |
| Experiment Setup | Yes | Our approach is implemented using the PyTorch framework 2.1.0 and the PyTorch-Geometric library 2.6.0. The CUDA version is 12.4. It is trained on RTX 3090 GPUs. We utilize the SGD optimizer with a momentum of 0.9, a learning rate of 1e-2, and a weight decay of 5e-4. The initial graph construction radius is set to 4. All results are averaged over 5 runs with different random seeds. The training batch sizes for each task are as follows: FOLD (16), Reaction (32), GOCC (32), GOMF (24), GOBP (24), and EC (32). We simply set the weight of the GCL loss to λ = 1 in our objective function throughout our experiments. For the augmentation strength of FCI, we set ϵ = 0.2 for GO, EC, and Reaction, and ϵ = 0.3 for FOLD. For the augmentation strength of 3-PSIAlpha, we set 2 rotations for EC, GOBP, and FOLD, and 3 rotations for GOCC and Reaction. For 3-PSIDiag, we perturb all dihedral angles. |
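The optimizer settings quoted above (SGD with momentum 0.9, learning rate 1e-2, weight decay 5e-4) can be reproduced with a few lines of PyTorch. This is a minimal sketch only: the `torch.nn.Linear` model below is a hypothetical placeholder, since the paper's actual GCL encoder is not released.

```python
import torch

# Placeholder model; the paper's GCL protein encoder is not publicly available.
model = torch.nn.Linear(128, 64)

# Optimizer configuration as reported in the paper's experiment setup.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-2,            # learning rate 1e-2
    momentum=0.9,       # momentum 0.9
    weight_decay=5e-4,  # weight decay 5e-4
)
```

The batch size (16–32 depending on task) and the GCL loss weight λ = 1 would then be set in the task-specific training loop.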