Surface-VQMAE: Vector-quantized Masked Auto-encoders on Molecular Surfaces

Authors: Fang Wu, Stan Z. Li

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments on diverse real-life scenarios including binding site scoring, binding affinity prediction, and mutant effect estimation demonstrate its effectiveness. |
| Researcher Affiliation | Collaboration | School of Engineering, Westlake University. Correspondence to: Stan Z. Li <stan.zq.li@westlake.edu.cn>. |
| Pseudocode | No | The paper describes the methods in narrative text and mathematical formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/smiles724/VQMAE. |
| Open Datasets | Yes | The unlabeled data for pretraining Surface-VQMAE is procured from PDB-REDO (Joosten et al., 2014). |
| Dataset Splits | Yes | These clusters are further randomly divided into the training, validation, and test sets by 95%/0.5%/4.5%, respectively (see the split sketch below the table). |
| Hardware Specification | Yes | We implement all experiments on 4 A100 GPUs, each with 80 GB of memory. |
| Software Dependencies | No | The paper mentions the use of an Adam optimizer and the KeOps library, but it does not specify version numbers for these or other key software components such as Python or PyTorch. |
| Experiment Setup | Yes | During the pretraining stage, Surface-VQMAE is trained with an Adam optimizer (Kingma & Ba, 2014) with a weight decay of 5e-3 and with β1 = 0.9 and β2 = 0.999. A ReduceLROnPlateau scheduler is employed to automatically adjust the learning rate, with a patience of 5 epochs and a minimum learning rate of 1e-7. The batch size is set to 32 and the initial learning rate is 1e-4. The maximum number of iterations is 200K with 10K warm-up iterations, and the validation frequency is 1K iterations. The random seed is fixed as 2023 (see the configuration sketch below the table). |